PLANT RETROELEMENTS AND METHODS RELATED THERETO

This application is a continuation-in-part of U.S. Patent Application Serial
Number 09/322,478, which application was filed May 28, 1999, which application
claimed priority to U.S. Provisional Patent Application Serial Number 60/087,125,
filed May 29, 1998.

The present invention was funded, in part, by the United States Department 
of Agriculture, Contract Number IOW03120; the United States Government may 
have certain rights in the invention. 

FIELD OF THE INVENTION 

The present invention provides plant retroelements and methods related to 
plant retroelements. The invention involves techniques from the fields of: molecular 
biology, virology, genetics, bioinformatics, and, to a lesser extent, other related 
fields. 

BACKGROUND OF THE INVENTION 

The eukaryotic retrotransposons are divided into two distinct classes of 
elements based on their structure: the long terminal repeat (LTR) retrotransposons 
and the LINE-like or non-LTR elements. Doolittle et al. (1989) Quart. Rev. Biol. 64:
1-30; Xiong and Eickbush (1990) EMBO J 9: 3353-3362. These element classes are 
related by the fact that each must undergo reverse transcription of an RNA 
intermediate to replicate, and each generally encodes its own reverse transcriptase. 
The LTR retrotransposons replicate by a mechanism which resembles that of the 
retroviruses. Boeke and Sandmeyer, (1991) Yeast transposable elements. In The 
Molecular and Cellular Biology of the Yeast Saccharomyces, edited by J. Broach, 
E. Jones and J. Pringle, pp. 193-261. Cold Spring Harbor Laboratory, Cold Spring 
Harbor, N. Y. They typically use a specific tRNA to prime reverse transcription, and 
a linear cDNA is synthesized through a series of template transfers that require
redundant LTR sequences at each end of the element mRNA. This all occurs within
a virus-like particle formed from proteins encoded by the retrotransposon mRNA. 
After reverse transcription, an integration complex is organized that directs the 
resulting cDNA to a new site in the genome of the host cell. 

Phylogenetic analyses based on reverse transcriptase amino acid sequences
resolve the LTR retrotransposons into two families: the Ty3/gypsy retrotransposons
(Metaviridae), and the Ty1/copia elements (Pseudoviridae). Boeke et al. (1998)
Metaviridae. In Virus Taxonomy: ICTV VIIth Report, edited by F. A. Murphy.
Springer-Verlag, New York; Boeke et al. (1998) Pseudoviridae. In Virus Taxonomy:
ICTV VIIth Report, edited by F. A. Murphy. Springer-Verlag, New York; Xiong and
Eickbush (1990) EMBO J. 9: 3353-3362. Although distinct, Ty3/gypsy elements are
more closely related to the retroviruses than to the Ty1/copia elements. They also
share a similar genetic organization with the retroviruses, principally in the order of
integrase and reverse transcriptase in their pol genes. For the Ty3/gypsy elements,
reverse transcriptase precedes integrase, and this order is reversed for the Ty1/copia
elements. In addition, some Ty3/gypsy elements have an extra open reading frame
(ORF) similar to those encoding retroviral envelope (env) proteins, which are
required for viral infectivity. The Drosophila melanogaster gypsy retrotransposons
encode an env-like ORF and can be transmitted between cells. Kim et al. (1994)
Proc. Natl. Acad. Sci. USA 91: 1285-1289; Song et al. (1994) Genes & Dev. 8:
2046-2057. Thus there are two distinct lineages of infectious LTR retroelements,
the retroviruses and those Ty3/gypsy retrotransposons that encode envelope-like
proteins. The Ty3/gypsy elements have been divided into two genera, the
metaviruses and the errantiviruses, the latter of which include all elements with
env-like genes. Boeke et al. (1998) Metaviridae. In Virus Taxonomy: ICTV VIIth
Report, edited by F. A. Murphy. Springer-Verlag, New York.
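
The classification rule described above (relative order of reverse transcriptase and integrase in pol, plus presence of an env-like ORF) can be sketched in a few lines. This is only an illustration: the function name, the domain labels "RT" and "IN", and the list-of-labels input format are assumptions made here and do not come from the cited references.

```python
def classify_ltr_retrotransposon(pol_domains, has_env_like_orf=False):
    """Classify an LTR retrotransposon from the order of its pol domains.

    pol_domains: domain labels in 5'-to-3' order. The labels "RT" (reverse
    transcriptase) and "IN" (integrase) are illustrative assumptions.
    has_env_like_orf: True if an envelope-like ORF has been identified.
    """
    rt_index = pol_domains.index("RT")
    in_index = pol_domains.index("IN")
    if rt_index < in_index:
        # Reverse transcriptase precedes integrase: Ty3/gypsy organization.
        # Elements with env-like genes are placed with the errantiviruses.
        genus = "errantivirus" if has_env_like_orf else "metavirus"
        return f"Ty3/gypsy ({genus})"
    # Integrase precedes reverse transcriptase: Ty1/copia organization.
    return "Ty1/copia"


print(classify_ltr_retrotransposon(["PR", "RT", "RNaseH", "IN"]))
print(classify_ltr_retrotransposon(["PR", "IN", "RT", "RNaseH"]))
print(classify_ltr_retrotransposon(["PR", "RT", "RNaseH", "IN"], has_env_like_orf=True))
```

In practice, domain order would first have to be determined by sequence analysis; the sketch assumes that step has already been done.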

In plants, retrotransposons have been extremely successful. Bennetzen (1996) 
Trends Microbiol. 4: 347-353; Voytas (1996) Genetics 142: 569-578. The enormous 
size of many plant genomes demonstrates a great tolerance for repetitive DNA, a 
substantial proportion of which appears to be composed of retrotransposons.
Because of their abundance, retrotransposons have undoubtedly influenced plant
gene evolution. They can cause mutations in coding sequences (Grandbastien et al.
(1989) Nature 337: 376-380; Hirochika et al. (1996) Proc. Natl. Acad. Sci. USA 93:
7783-7788; Purugganan and Wessler (1994) Proc. Natl. Acad. Sci. USA 91:
11674-11678), and the promoter regions of some plant genes contain relics of
retrotransposon insertions that contribute transcriptional regulatory sequences. White 
et al. (1994) Proc. Natl. Acad. Sci. USA 91: 11792-11796. Retrotransposons also 
generate gene duplications: Repetitive retrotransposon sequences provide substrates 
for unequal crossing over, and such an event is thought to have caused a zein gene 
duplication in maize. White et al. (1994) Proc. Natl. Acad. Sci. USA 91: 11792- 
11796. Occasionally, cellular mRNAs are reverse transcribed and the resultant 
cDNA recombines into the genome giving rise to new genes, or more frequently, 
cDNA pseudogenes. Maestre et al. (1995) EMBO J. 14: 6333-6338. The 
transduction of gene sequences during reverse transcription, which produced the 
oncogenic retroviruses, has also been documented to occur for a plant 
retrotransposon (Bureau et al. (1994) Cell 77: 479-480; Jin and Bennetzen (1994)
Plant Cell 6: 1177-1186); a maize Bs1 insertion in Adh1 carries part of an ATPase
gene and is the only known example of a retrotransposon-mediated gene transduction
event.

Plant genomes encode representatives of the two major lineages of LTR 
retrotransposons that have been identified in other eukaryotes. Among these are 
numerous examples of Ty1/copia elements (e.g. Konieczny et al. (1991) Genetics
127: 801-809; Voytas and Ausubel (1988) Nature 336: 242-244; Voytas et al. (1990)
Genetics 126: 713-721). Also prevalent are Ty3/gypsy elements, which are members
of the genus Metaviridae (Smyth et al. 1989; Purugganan and Wessler 1994 Proc.
Natl. Acad. Sci. USA 91: 11674-11678; Su and Brown 1997). As stated above, the
metaviruses do not encode an envelope protein characteristic of the retroviruses. It 
has been suggested that some plant retrovirus-like elements may have lost, or not yet 
gained, genes such as the envelope gene required for cell-to-cell transmission 
(Bennetzen (1996) Trends Microbiol. 4: 347-353). As one group of researchers 
described the uncertainty, "[s]ince genes encoding ENV [envelope] functions are
very heterogeneous at the sequence level and difficult to identify by homology even
between retroviruses, the possibility cannot be completely excluded at the present
time that the 3' ORF of Cyclops [the retrotransposon described in the paper] is, in
fact, an env gene and, hence, Cyclops is a retrovirus or a descendant of one." 
Chavanne et al. (1998) Plant Molecular Biol 37: 363-375. 

Citation of the above documents is not intended as an admission that any of
the foregoing is pertinent prior art. All statements as to the date or representation as
to the contents of these documents are based on subjective characterization of
information available to the applicant, and do not constitute any admission as to
the accuracy of the dates or contents of these documents.

SUMMARY OF THE INVENTION 

In general, the present invention provides materials, such as nucleic acids, 
vectors, cells, and plants (including plant parts, seeds, embryos, etc.), and methods 
to manipulate the materials. In particular, molecular tools are provided in the form 
of retroelements and retroelement-containing vectors, cells and plants. The 
particular methods include methods to introduce the retroelements into cells, 
especially wherein the retroelements carry at least one agronomically-significant
characteristic. The best mode of the present invention is a particular method to 
transfer agronomically-significant characteristics to plants wherein a helper cell line 
which expresses gag, pol and env sequences is used to enable transfer of a secondary 
construct which carries an agronomically-significant characteristic and has 
retroelement sequences that allow for replication and integration. 

In one embodiment, there are provided isolated nucleic acid molecules, 
wherein said nucleic acid molecules encode at least a portion of a plant retroelement 
and comprise a nucleic acid sequence selected from the group consisting of:



(a) a nucleic acid sequence which is a plant retroelement primer binding site and 
which has more than 95% identity to SEQ ID NO 2, wherein said identity can be 
determined using the DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which is at least a portion of a plant retroelement 
envelope sequence and which has more than 50% identity to SEQ ID NO 5, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which is at least a portion of a plant retroelement gag 
sequence and which has more than 50% identity to SEQ ID NO 7, wherein said 
identity can be determined using the DNAsis computer program and default 
parameters; 

(d) a nucleic acid sequence which is at least a portion of a plant retroelement 
integrase sequence and which has more than 70% identity to SEQ ID NO 9, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(e) a nucleic acid sequence which is at least a portion of a plant retroelement 
reverse transcriptase sequence and which has more than 70% identity to SEQ ID NO
11, wherein said identity can be determined using the DNAsis computer program and
default parameters; 

(f) a nucleic acid sequence which is at least a portion of a plant retroelement 
protease sequence and which has more than 50% identity to SEQ ID NO 13, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(g) a nucleic acid sequence which is at least a portion of a plant retroelement 
RNAseH sequence and which has more than 70% identity to SEQ ID NO 15,
wherein said identity can be determined using the DNAsis computer program and
default parameters;

(h) a nucleic acid sequence which is at least a portion of a plant retroelement 
sequence and which has more than 50% identity to SEQ ID NO 17, wherein said 
identity can be determined using the DNAsis computer program and default 
parameters; 

(i) a nucleic acid sequence which is selected from the group consisting of: SEQ ID
NO 2; SEQ ID NO 5; SEQ ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13;
SEQ ID NO 15; and SEQ ID NO 17;

(j) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement envelope sequence and has more than 30% identity 
to SEQ ID NO 6, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(k) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement gag sequence and has more than 30% identity to 
SEQ ID NO 8, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(l) a nucleic acid sequence which encodes an amino acid sequence which is at least
a portion of a plant retroelement integrase sequence and has more than 75% identity 
to SEQ ID NO 10, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(m) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement reverse transcriptase sequence and has more than 
79% identity to SEQ ID NO 12, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 



(n) a nucleic acid sequence which encodes an amino acid sequence which is at least
a portion of a plant retroelement protease sequence and has more than 55% identity
to SEQ ID NO 14, wherein said identity can be determined using the DNAsis
computer program and default parameters;

(o) a nucleic acid sequence which encodes an amino acid sequence which is at least
a portion of a plant retroelement RNAseH sequence and has more than 90% identity
to SEQ ID NO 16, wherein said identity can be determined using the DNAsis
computer program and default parameters;

(p) a nucleic acid sequence which encodes an amino acid sequence which is at least
a portion of a plant retroelement sequence and has more than 40% identity to SEQ
ID NO 18, wherein said identity can be determined using the DNAsis computer
program;

(q) a nucleic acid sequence which encodes an amino acid sequence selected from the
group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10;
SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and SEQ ID NO 18;

(r) a nucleic acid sequence which encodes an allelic variant of an amino acid
sequence selected from the group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ
ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and
SEQ ID NO 18; and

(s) a nucleic acid sequence fully complementary to a nucleic acid sequence selected
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); a nucleic acid
sequence of (e); a nucleic acid sequence of (f); a nucleic acid sequence of (g); a
nucleic acid sequence of (h); a nucleic acid sequence of (i); a nucleic acid sequence
of (j); a nucleic acid sequence of (k); a nucleic acid sequence of (l); a nucleic acid
sequence of (m); a nucleic acid sequence of (n); a nucleic acid sequence of (o); a
nucleic acid sequence of (p); a nucleic acid sequence of (q); and a nucleic acid
sequence of (r).
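
The percent-identity thresholds recited in the list above can be illustrated with a minimal sketch. The DNAsis program and its default parameters are not reproduced here; the simple global alignment, the scoring values (match, mismatch, gap), and the function names below are all assumptions chosen for illustration, and a different alignment program may report a different identity value for the same pair of sequences.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not DNAsis defaults.
    Returns the two aligned strings, padded with '-' at gap positions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if score[i][j] == diag:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical short sequences, used only to show the threshold comparison.
identity = percent_identity("ACGTACGT", "ACGTTCGT")
print(identity, identity > 50.0)
```

Dividing by the full alignment length (gaps included) is one common convention; other programs divide by the shorter sequence length or exclude gapped columns, which is why the choice of program and parameters matters when a threshold is recited.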

Seeds and plants comprising a nucleic acid as above are particularly provided. 
Nucleic acid molecules as above which comprise gag, pol and env genes and which 
comprise adenine-thymidine-guanidine as the gag gene start codon are also 
particularly provided. Those which comprise gag, pol and env genes, the adenine-
thymidine-guanidine as the gag gene start codon, and which further comprise SEQ
ID NO 4 are also provided.

Plant envelope sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant envelope sequence and comprise a nucleic acid sequence
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 5, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 5; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 6, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 6; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 6; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant envelope proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant envelope protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant integrase sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant integrase sequence and comprise a nucleic acid sequence
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 9, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 9; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 10, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 10; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 10; and 



(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant integrase proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant integrase protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant reverse transcriptase sequences and constructs which comprise the 
sequences are provided, as are cells, seeds, embryos and plants comprising them. 
Preferred are isolated nucleic acid molecules, wherein said nucleic acid molecules 
encode at least a portion of a plant reverse transcriptase sequence and comprise a
nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 11, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 11; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 12, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 12; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 12; and

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant reverse transcriptase proteins comprising an 
amino acid sequence encoded by the above. Methods to impart agronomically- 
significant characteristics to at least one plant cell are also provided, comprising: 
contacting a plant reverse transcriptase protein as described to at least one plant cell 
under conditions sufficient to allow a nucleic acid molecule to enter said cell, 
wherein said nucleic acid molecule encodes an agronomically-significant 
characteristic. 

Plant RNAseH sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant RNAseH sequence and comprise a nucleic acid sequence
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 15, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 15; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 95% identity to SEQ ID NO 16, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 16;

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 16; and

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant RNAseH proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant RNAseH protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant retroelement sequences and constructs which comprise the sequences 
are provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant retroelement sequence and comprise a nucleic acid sequence
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 95% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 2; SEQ ID NO 5; SEQ 
ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13; SEQ ID NO 15; and SEQ
ID NO 17, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(b) a nucleic acid sequence which is selected from the group consisting of: SEQ ID 
NO 2; SEQ ID NO 5; SEQ ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13;
SEQ ID NO 15; and SEQ ID NO 17;

(c) a nucleic acid sequence which encodes an amino acid sequence which has more
than 90% identity to an amino acid sequence selected from the group consisting of 
SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ 
ID NO 14; SEQ ID NO 16; SEQ ID NO 18, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes an amino acid sequence selected from the 
group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10; 
SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and SEQ ID NO 18; 

(e) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence selected from the group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ 
ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and 
SEQ ID NO 18; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Nucleic acid molecules as above, which further comprise at least one nucleic acid
sequence which encodes at least one agronomically-significant characteristic, are
preferred. More preferred are those nucleic acid molecules as described wherein the
agronomically-significant characteristic is selected from the group consisting of: 
male sterility; self-incompatibility; foreign organism resistance; improved 
biosynthetic pathways; environmental tolerance; photosynthetic pathways; and 
nutrient content and those wherein the agronomically significant characteristic is 
selected from the group consisting of: fruit ripening; oil biosynthesis; pigment 
biosynthesis; seed formation; starch metabolism; salt tolerance; cold/frost tolerance; 
drought tolerance; tolerance to anaerobic conditions; protein content; carbohydrate 
content (including sugars and starches); amino acid content; and fatty acid content. 



Seeds and plants comprising a nucleic acid molecule as described are also preferred. 
More preferred are plants as described, wherein the plant is selected from the group 
consisting of: soybean; maize; sugar cane; beet; tobacco; wheat; barley; poppy; rape; 
sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; tomato; lettuce; 
chicory; pepper; melon; cabbage; oat; rye; cotton; flax; potato; pine; walnut; citrus 
(including oranges, grapefruit etc.); hemp; oak; rice; petunia; orchids; Arabidopsis; 
broccoli; cauliflower; Brussels sprouts; onion; garlic; leek; squash; pumpkin; celery;
pea; bean (including various legumes); strawberries; grapes; apples; pears; peaches; 
banana; palm; cocoa; cucumber; pineapple; apricot; plum; sugar beet; lawn grasses; 
maple; triticale; safflower; peanut; and olive. Most preferred are plants as described 
which are soybean plants. 

Plant retroelements comprising an amino acid sequence encoded by a nucleic acid 
sequence described are also provided. Plant cells comprising a nucleic acid molecule 
described herein, as well as plant retroviral proteins encoded by nucleic acid 
molecules described herein are provided. 

Moreover, methods to transfer nucleic acid into a plant cell, comprising contacting 
a nucleic acid molecule of the present invention with at least one plant cell under 
conditions sufficient to allow said nucleic acid molecule to enter at least one cell of 
said plant are provided. In particular, there are provided methods to impart
agronomically-significant characteristics to at least one plant cell, comprising: 
contacting a plant retroelement of the present invention to at least one plant cell 
under conditions sufficient to allow a nucleic acid molecule to enter said cell, 
wherein said nucleic acid molecule encodes an agronomically-significant 
characteristic. Methods as described, wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; self- 
incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content and those 
wherein the agronomically significant characteristic is selected from the group 
consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance
to anaerobic conditions; protein content; carbohydrate content (including sugars and
starches); amino acid content; and fatty acid content. 

Plant retroelement sequences comprising specialized signals, and constructs 
which comprise the sequences are provided, as are cells, seeds, embryos and plants 
comprising them. Preferred are isolated nucleic acid molecules, comprising a nucleic
acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 95% identity to SEQ ID NO 2,
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which is SEQ ID NO 2; 

(c) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 4; and 

(d) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); and a nucleic acid sequence of (c). 

Plant retroelements as described above, which further comprise at least one nucleic 
acid sequence which encodes at least one agronomically-significant characteristic are 
preferred. More preferred are those methods wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; self- 
incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content and those 
wherein the agronomically significant characteristic is selected from the group 
consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance 
to anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content.

Preferred are plant retroviral particles comprising an isolated retroelement as
described, and seeds and plants comprising the retroelements as described. More 
preferred plants include soybean; maize; sugar cane; beet; tobacco; wheat; barley; 
poppy; rape; sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; tomato; 
lettuce; chicory; pepper; melon; cabbage; oat; rye; cotton; flax; potato; pine; walnut; 
citrus (including oranges, grapefruit etc.); hemp; oak; rice; petunia; orchids; 
Arabidopsis; broccoli; cauliflower; Brussels sprouts; onion; garlic; leek; squash;
pumpkin; celery; pea; bean (including various legumes); strawberries; grapes; 
apples; pears; peaches; banana; palm; cocoa; cucumber; pineapple; apricot; plum; 
sugar beet; lawn grasses; maple; triticale; safflower; peanut; and olive. Soybean is 
most preferred. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroelement as described with at least one plant cell under 
conditions sufficient to allow said plant retroelement to enter said cell. Methods to 
impart agronomically-significant characteristics to a plant, comprising contacting a 
plant retroelement as described with at least one plant cell under conditions sufficient 
to allow said plant retroelement to enter said cell are also preferred. Those methods 
wherein the plant retroelement is contacted with said cell via a plant retroviral 
particle described herein are preferred. 

Plant retroviruses are also provided. In particular, plant retroviral particles 
comprising a plant-derived retrovirus envelope protein are provided. Plant retroviral 
particles comprising a plant-derived retrovirus envelope protein and which further 
comprise a plant retroviral protein selected from the group consisting of: plant- 
derived integrase; plant derived reverse transcriptase; plant-derived gag; and plant- 
derived RNAseH are preferred. 

Plant retroviral particles comprising specialized retroviral proteins, and cells, 
seeds, embryos and plants which comprise the retroviral particles are provided. 
Preferred are isolated retroviral particles comprising a plant retroviral protein 
encoded by a nucleic acid sequence selected from the group consisting of:

(a) a nucleic acid sequence comprising (i) a nucleic acid sequence which encodes
at least one plant retroviral envelope protein, and (ii) a nucleic acid sequence which 
has more than 60% identity to a nucleic acid sequence selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 15; SEQ ID NO 26; SEQ
ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30; and SEQ ID NO 31, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence of (a); 

(c) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid sequence of (a); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); and a nucleic acid sequence of (c). 
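As an illustration of what "fully complementary" means in item (d), the following minimal sketch computes the reverse complement of a DNA string under standard Watson-Crick pairing. The function names are hypothetical and are not part of the disclosure; only unambiguous bases (A, C, G, T) are assumed.

```python
# Watson-Crick pairing for unambiguous DNA bases (illustrative helper, not from the source).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_fully_complementary(a: str, b: str) -> bool:
    """True when b pairs with a at every position, with no mismatches or overhangs."""
    return reverse_complement(a) == b.upper()
```

A sequence and its reverse complement satisfy `is_fully_complementary`; a single mismatch or a length difference does not.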

In particular, plant retroviral particles wherein said nucleic acid sequence as 
described in (a) comprises a plant envelope nucleic acid specifically mentioned in 
claim 6 are preferred. Those particles which further comprise at least 
one nucleic acid sequence which encodes at least one agronomically-significant 
characteristic are preferred. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

More preferred are isolated retroviral particles comprising a plant retroviral 
protein encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 80% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 9; SEQ ID NO 11; and 
SEQ ID NO 15, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(b) a nucleic acid sequence which encodes a nucleic acid selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; and SEQ ID NO 15; 

(c) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence selected from the group consisting of: a nucleic acid sequence 
of (a); and a nucleic acid sequence of (b); 

(d) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(e) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); and a nucleic acid sequence of (d). 

Nucleic acids as above which further comprise at least one nucleic acid sequence 
which encodes at least one agronomically-significant characteristic are preferred. 
More preferred are those nucleic acids wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; 
self-incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content. Also more 
preferred are those isolated nucleic acid molecules as described, wherein the 
agronomically-significant characteristic is selected from the group consisting of: 
fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; starch 
metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance to 
anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

Also preferred are isolated retroviral particles comprising a plant retroviral 
protein encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 60% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ 
ID NO 15; SEQ ID NO 26; SEQ ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ 
ID NO 30; and SEQ ID NO 31, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes a nucleic acid selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 15; SEQ ID NO 26; SEQ 
ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30; and SEQ ID NO 31; 

(c) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence selected from the group consisting of: a nucleic acid sequence 
of (a); and a nucleic acid sequence of (b); 

(d) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(e) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); and a nucleic acid sequence of (d). 

Plant retroviral particles as described above which further comprise an 
envelope-encoding nucleic acid sequence specifically described herein are preferred. Preferred 
are those retroviral particles which further comprise at least one nucleic acid 
sequence which encodes at least one agronomically-significant characteristic. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

Also provided by the present invention are isolated nucleic acid molecules, 
wherein said nucleic acid molecule encodes at least a portion of a plant retroelement 
reverse transcriptase and comprises a nucleic acid sequence selected from the group 
consisting of: 

(a) a nucleic acid sequence having more than 85% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 85% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 
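The even/odd numbering convention used in items (a) and (b) can be made concrete with a short sketch. It assumes, consistent with the sequence listing summary, that each even-numbered SEQ ID NO from 42 to 164 is a DNA sequence and that the next odd number is its amino acid translation. The function name is hypothetical.

```python
def reverse_transcriptase_seq_id_pairs(first_dna: int = 42, last_dna: int = 164):
    """Pair each even-numbered (DNA) SEQ ID NO with the following odd-numbered
    (amino acid) SEQ ID NO, inclusive of both endpoints."""
    if first_dna % 2 or last_dna % 2:
        raise ValueError("DNA SEQ ID NOs are expected to be even-numbered")
    return [(dna_id, dna_id + 1) for dna_id in range(first_dna, last_dna + 1, 2)]


pairs = reverse_transcriptase_seq_id_pairs()
```

Under these assumptions the range covers 62 DNA/amino-acid pairs, from (42, 43) through (164, 165).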

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided. Also 
provided by the present invention are isolated nucleic acid molecules as described, 
wherein said nucleic acid molecule encodes at least a portion of a plant envelope 
sequence and comprises a nucleic acid sequence selected from the group consisting 
of: 

(a) a nucleic acid sequence which has more than 90% identity to 
SEQ ID NO 5, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
which has greater than 85% identity to SEQ ID NO 6, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of SEQ 
ID NO 5; and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); a nucleic acid sequence of (b); and a nucleic acid 
sequence of (c). 

Plant cells comprising this embodiment are also provided, as are methods to impart 
agronomically-significant characteristics to at least one plant cell, comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Also part of the present invention are isolated nucleic acid molecules, 
wherein said nucleic acid molecule encodes at least a portion of a plant retroelement 
reverse transcriptase and comprises a nucleic acid sequence selected from the group 
consisting of: 

(a) a nucleic acid sequence having more than 95% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 95% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided, as are 
methods to impart agronomically-significant characteristics to at least one plant cell, 
comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Also provided are isolated nucleic acid molecules, wherein said nucleic acid 
molecule encodes at least a portion of a plant retroelement reverse transcriptase and 
comprises a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence selected from the group consisting of 
even-numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ 
ID NO 164, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
selected from the group consisting of odd-numbered SEQ ID NOs 
inclusive from SEQ ID NO 43 through SEQ ID NO 165, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided, as are 
methods to impart agronomically-significant characteristics to at least one plant cell, 
comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Nucleic acid molecules of the present invention which further comprise at 
least one nucleic acid sequence which encodes at least one agronomically-significant 
characteristic are also provided. Those nucleic acid molecules wherein the 
agronomically-significant characteristic is selected from the group consisting of: 
male sterility; self-incompatibility; foreign organism resistance; improved 
biosynthetic pathways; environmental tolerance; photosynthetic pathways; and 
nutrient content are preferred. Also preferred are those nucleic acid molecules 
wherein the agronomically significant characteristic is selected from the group 
consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance 
to anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content. 

Also provided are isolated plant retroviral particles comprising a nucleic acid 
molecule of the present invention. 

Preferred plants are selected from the group consisting of: soybean; maize; 
sugar cane; beet; tobacco; wheat; barley; poppy; rape; sunflower; alfalfa; sorghum; 
rose; carnation; gerbera; carrot; tomato; lettuce; chicory; pepper; melon; cabbage; 
oat; rye; cotton; flax; potato; pine; walnut; citrus (including oranges, grapefruit etc.); 
hemp; oak; rice; petunia; orchids; Arabidopsis; broccoli; cauliflower; Brussels sprouts; 
onion; garlic; leek; squash; pumpkin; celery; pea; bean (including various legumes); 
strawberries; grapes; apples; pears; peaches; banana; palm; cocoa; cucumber; 
pineapple; apricot; plum; sugar beet; lawn grasses; maple; triticale; safflower; 
peanut; and olive. 

In the present invention, it is preferred that the nucleic acid sequences are 
transmissible either to all plants or to a limited set of plants, such as a species. For 
instance, plant viruses in general infect only a narrow host range, or may infect a 
single species, and the present compounds may be genetically engineered to be 
similar. However, if a broad host range is desirable, those features which cause 
specificity can be removed or overridden by the feature of broad transmissibility. 
The present invention is drawn to both these embodiments, as well as other 
variations. 

"Allelic variant" is meant to refer to a full length gene or partial sequence of a full 
length gene that occurs at essentially the same locus (or loci) as the referent 
sequence, but which, due to natural variations caused by, for example, mutation or 
recombination, has a similar but not identical sequence. Allelic variants typically 
encode proteins having similar activity to that of the protein encoded by the gene to 
which they are being compared. Allelic variants can also comprise alterations in the 
5' or 3' untranslated regions of the gene (e.g., in regulatory control regions). 

By "agronomically-significant" it is meant any trait of a plant which is recognized 
by members of the agricultural industry as desirable. 

"Fragment" is meant to refer to any subset of the referent nucleic acid molecule. 

By "plant" it is meant one or more plant seed, plant embryo, plant part or whole 
plant. The plant may be an angiosperm (monocot or dicot), gymnosperm, man-made 
or naturally-occurring. 

By "proteins" it is meant any compounds which comprise amino acids, including 
peptides, polypeptides, fusion proteins, etc. 

Moreover, for the purposes of the present invention, the term "a" or "an" 
entity refers to one or more of that entity; for example, "a protein" or "a nucleic acid 
molecule" refers to one or more of those compounds or at least one compound. As 
such, the terms "a" (or "an"), "one or more" and "at least one" can be used 
interchangeably herein. It is also to be noted that the terms "comprising", 
"including", and "having" can be used interchangeably. Furthermore, a compound 
"selected from the group consisting of" refers to one or more of the compounds in 
the list that follows, including mixtures (i.e., combinations) of two or more of the 
compounds. According to the present invention, an isolated, or biologically pure, 
protein or nucleic acid molecule is a compound that has been removed from its 
natural milieu. As such, "isolated" and "biologically pure" do not necessarily reflect 
the extent to which the compound has been purified. An isolated compound of the 
present invention can be obtained from its natural source, can be produced using 
molecular biology techniques or can be produced by chemical synthesis. Lastly, 
"more than" and "greater than" are interchangeable, and when used to modify a 
percent identity, e.g., "more than 90% identity", mean any increment to 100%, so long 
as the increment is greater than the percentage specifically named. In the 
example of "more than 90% identity", the term would include, among all other 
possibilities, 90.00001%, 93.7%, 98.04%, 99.0827% and 100%. 

The following is a summary of the sequence listing, as a convenient reference: 

SEQ ID NO   Description 

1           specialized primer binding site version 1 
2           specialized primer binding site version 2 
3           specialized polypurine tract 
4           targeting sequence 
5           NA generic envelope 
6           AA of 5 
7           NA of generic gag 
8           AA of 7 
9           NA of generic integrase 
10          AA of 9 
11          NA of generic reverse transcriptase 
12          AA of 11 
13          generic protease 
14          AA of 13 
15          generic RNAseH 
16          AA of 15 
17          generic retroelement 
18          AA of 17 
19          NA calypso 1-1 
20          NA calypso 1-2 
21          NA calypso 1-3 
22          NA calypso 2-1 
23          NA calypso 2-2 
24          NA athila env 
25          NA cyclops env 
26          NA athila integrase 
27          NA athila reverse transcriptase 
28          NA athila RNAseH 
29          NA cyclops reverse transcriptase 
30          NA cyclops RNAseH 
31          NA cyclops integrase 
32          NA calypso envelope 
33          NA calypso RNAseH 
34          NA calypso reverse transcriptase 
35          NA calypso integrase 
36          Primer binding site A 
37          Primer binding site B 
38          Primer binding site minimum 
39          Primer binding site extended 
40          polypurine tract A 
41          polypurine tract B 
42          Tobacco1 DNA 
43          Tobacco1 AA 
44          Tobacco2-2 DNA 
45          Tobacco2-2 AA 
46          Tobacco4-1 DNA 
47          Tobacco4-1 AA 
48          Tobacco8-3 DNA 
49          Tobacco8-3 AA 
50          Rice1 DNA 
51          Rice1 AA 
52          Rice2-10 DNA 
53          Rice2-10 AA 
54          Rice2-17 DNA 
55          Rice2-17 AA 
56          Rice5-2 DNA 
57          Rice5-2 AA 
58          Barley2-4 DNA 
59          Barley2-4 AA 
60          Barley2-12 DNA 
61          Barley2-12 AA 
62          Barley2-19 DNA 
63          Barley2-19 AA 
64          Barley7 DNA 
65          Barley7 AA 
66          Oat6-1 DNA 
67          Oat6-1 AA 
68          Oat6-7 DNA 
69          Oat6-7 AA 
70          Oat6-8 DNA 
71          Oat6-8 AA 
72          Rye5-2 DNA 
73          Rye5-2 AA 
74          Rye3-4 DNA 
75          Rye3-4 AA 
76          Rye4-4 DNA 
77          Rye4-4 AA 
78          Rye5-4 DNA 
79          Rye5-4 AA 
80          Wheat3-1 DNA 
81          Wheat3-1 AA 
82          Wheats -3 DNA 
83          Wheats -3 AA 
84          Wheat8-2 DNA 
85          Wheat8-2 AA 
86          Wheat8-5 DNA 
87          Wheat8-5 AA 
88          Wheat8-11 DNA 
89          Wheat8-11 AA 
90          Cotton5-3 DNA 
91          Cotton5-3 AA 
92          Cotton8-6 DNA 
93          Cotton8-6 AA 
94          Cotton8-7 DNA 
95          Cotton8-7 AA 
96          Tomato4-4 DNA 
97          Tomato4-4 AA 
98          Tomato4-10 DNA 
99          Tomato4-10 AA 
100         Tomato10-4 DNA 
101         Tomato10-4 AA 
102         Tomato10-16 DNA 
103         Tomato10-16 AA 
104         Potato5-1 DNA 
105         Potato5-1 AA 
106         Potato8-3 DNA 
107         Potato8-3 AA 
108         Potato8-4 DNA 
109         Potato8-4 AA 
110         Potato8-5 DNA 
111         Potato8-5 AA 
112         Potato8-8 DNA 
113         Potato8-8 AA 
114         Potato8-10 DNA 
115         Potato8-10 AA 
116         Sycamore2-3 DNA 
117         Sycamore2-3 AA 
118         Sycamore4-2 DNA 
119         Sycamore4-2 AA 
120         Sycamore4-3 DNA 
121         Sycamore4-3 AA 
122         Sycamore4-7 DNA 
123         Sycamore4-7 AA 
124         Sorghum4-3 DNA 
125         Sorghum4-3 AA 
126         SorghumS -2 DNA 
127         SorghumS -2 AA 
128         SorghumS -4 DNA 
129         SorghumS -4 AA 
130         SorghumS -5 DNA 
131         SorghumS -5 AA 
132         SorghumS -6 DNA 
133         SorghumS -6 AA 
134         SorghumS -8 DNA 
135         SorghumS -8 AA 
136         L85 Soybean8-2 DNA 
137         L85 Soybean8-2 AA 
138         L85 Soybean2 DNA 
139         L85 Soybean2 AA 
140         L85 Soybean9-2 DNA 
141         L85 Soybean9-2 AA 
142         L85 Soybean9-3 DNA 
143         L85 Soybean9-3 AA 
144         L85 Soybean9-6 DNA 
145         L85 Soybean9-6 AA 
146         Williams Soybean8-2 DNA 
147         Williams Soybean8-2 AA 
148         Williams Soybean8-3 DNA 
149         Williams Soybean8-3 AA 
150         Williams Soybean2 DNA 
151         Williams Soybean2 AA 
152         Williams Soybean3 DNA 
153         Williams Soybean3 AA 
154         Hark Soybean2 DNA 
155         Hark Soybean2 AA 
156         Hark Soybean5-1 DNA 
157         Hark Soybean5-1 AA 
158         Hark Soybeans DNA 
159         Hark Soybeans AA 
160         Pea1 DNA 
161         Pea1 AA 
162         Pea8-1 DNA 
163         Pea8-1 AA 
164         Pea9-1 DNA 
165         Pea9-1 AA 

DETAILED DESCRIPTION OF THE INVENTION 



In one embodiment, there are provided isolated nucleic acid molecules, 
wherein said nucleic acid molecules encode at least a portion of a plant retroelement 
and comprise a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which is a plant retroelement primer binding site and 
which has more than 95% identity to SEQ ID NO 2, wherein said identity can be 
determined using the DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which is at least a portion of a plant retroelement 
envelope sequence and which has more than 50% identity to SEQ ID NO 5, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which is at least a portion of a plant retroelement gag 
sequence and which has more than 50% identity to SEQ ID NO 7, wherein said 
identity can be determined using the DNAsis computer program and default 
parameters; 

(d) a nucleic acid sequence which is at least a portion of a plant retroelement 
integrase sequence and which has more than 70% identity to SEQ ID NO 9, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(e) a nucleic acid sequence which is at least a portion of a plant retroelement 
reverse transcriptase sequence and which has more than 70% identity to SEQ ID NO 
11, wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(f) a nucleic acid sequence which is at least a portion of a plant retroelement 
protease sequence and which has more than 50% identity to SEQ ID NO 13, wherein 
said identity can be determined using the DNAsis computer program and default 
parameters; 

(g) a nucleic acid sequence which is at least a portion of a plant retroelement 
RNAseH sequence and which has more than 70% identity to SEQ ID NO 15, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(h) a nucleic acid sequence which is at least a portion of a plant retroelement 
sequence and which has more than 50% identity to SEQ ID NO 17, wherein said 
identity can be determined using the DNAsis computer program and default 
parameters; 

(i) a nucleic acid sequence which is selected from the group consisting of: SEQ ID 
NO 2; SEQ ID NO 5; SEQ ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13; 
SEQ ID NO 15; and SEQ ID NO 17; 

(j) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement envelope sequence and has more than 30% identity 
to SEQ ID NO 6, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(k) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement gag sequence and has more than 30% identity to 
SEQ ID NO 8, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(l) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement integrase sequence and has more than 75% identity 
to SEQ ID NO 10, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(m) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement reverse transcriptase sequence and has more than 
79% identity to SEQ ID NO 12, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(n) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement protease sequence and has more than 55% identity 
to SEQ ID NO 14, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(o) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement RNAseH sequence and has more than 90% identity 
to SEQ ID NO 16, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(p) a nucleic acid sequence which encodes an amino acid sequence which is at least 
a portion of a plant retroelement sequence and has more than 40% identity to SEQ 
ID NO 18, wherein said identity can be determined using the DNAsis computer 
program; 

(q) a nucleic acid sequence which encodes an amino acid sequence selected from the 
group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10; 
SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and SEQ ID NO 18; 

(r) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence selected from the group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ 
ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and 
SEQ ID NO 18; and 

(s) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); a nucleic acid 
sequence of (e); a nucleic acid sequence of (f); a nucleic acid sequence of (g); a 
nucleic acid sequence of (h); a nucleic acid sequence of (i); a nucleic acid sequence 
of (j); a nucleic acid sequence of (k); a nucleic acid sequence of (l); a nucleic acid 
sequence of (m); a nucleic acid sequence of (n); a nucleic acid sequence of (o); a 
nucleic acid sequence of (p); a nucleic acid sequence of (q); and a nucleic acid 
sequence of (r). 

Seeds and plants comprising a nucleic acid as above are particularly provided. 
Nucleic acid molecules as above which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon are also 
particularly provided. Those which comprise gag, pol and env genes and 
adenine-thymine-guanine (ATG) as the gag gene start codon, and which further 
comprise SEQ ID NO 4, are also provided. 

Included within the scope of the present invention, with particular regard to 
the nucleic acids above, are allelic variants, degenerate sequences and homologues. 
The present invention also includes variants due to laboratory manipulation, such as, 
but not limited to, variants produced during polymerase chain reaction amplification 
or site directed mutagenesis. It is also well known that there is a substantial amount 
of redundancy in the various codons which code for specific amino acids. Therefore, 
this invention is also directed to those nucleic acid sequences which contain 
alternative codons which code for the eventual translation of the identical amino 
acid. Also included within the scope of this invention are mutations either in the 
nucleic acid sequence or the translated protein which do not substantially alter the 
ultimate physical properties of the expressed protein. For example, substitution of 
valine for leucine, arginine for lysine, or asparagine for glutamine may not cause a 
change in functionality of the polypeptide. Lastly, a nucleic acid sequence 
homologous to the exemplified nucleic acid molecules (or allelic variants or 
degenerates thereof) will have at least 85%, preferably 90%, and most preferably 
95% sequence identity with a nucleic acid molecule in the sequence listing. 
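The codon redundancy described above can be illustrated with a short sketch. It assumes the standard genetic code (NCBI translation table 1); the names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the disclosure.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True when two nucleic acid sequences translate to the identical amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For example, `ATGCTTAAA` and `ATGCTGAAG` differ at two nucleotide positions yet both translate to Met-Leu-Lys, so they are degenerate variants in the sense used above.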

It is known in the art that there are commercially available computer 
programs for determining the degree of similarity between two nucleic acid 
sequences. These computer programs include various known methods to determine 
the percentage identity and the number and length of gaps between hybrid nucleic 
acid molecules. Preferred methods to determine the percent identity among amino 
acid sequences and also among nucleic acid sequences include analysis using one or 
more of the commercially available computer programs designed to compare and 
analyze nucleic acid or amino acid sequences. These computer programs include, 
but are not limited to, GCG™ (available from Genetics Computer Group, Madison, 
WI), DNAsis™ (available from Hitachi Software, San Bruno, CA) and MacVector™ 
(available from the Eastman Kodak Company, New Haven, CT). A preferred 
method to determine percent identity among amino acid sequences and also among 
nucleic acid sequences includes using the Compare function by maximum matching 
within the program DNAsis Version 2.1 using default parameters. 
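
As a rough illustration of what a percent identity figure expresses, the following Python sketch counts matching positions in two sequences that have already been aligned. It is not the DNAsis maximum matching algorithm, and it does not reproduce that program's default parameters; the function name percent_identity, the gap convention, and the example sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Return percent identity between two pre-aligned, equal-length sequences.

    A gap character ('-') opposite a residue counts as a mismatch; columns
    that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned sequences: one substitution and one gap in ten columns.
identity = percent_identity("ACGTACGTAC", "ACGTTCG-AC")
meets_85_percent = identity >= 85.0
```

Different programs treat gaps and alignment length differently, so a value computed this way may not equal the value reported by a given commercial package.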

Knowing the nucleic acid sequences of the present invention allows one 
skilled in the art to, for example, (a) make copies of those nucleic acid molecules, 
(b) obtain nucleic acid molecules including at least a portion of such nucleic acid 
molecules (e.g., nucleic acid molecules including full-length genes, full-length 
coding regions, regulatory control sequences, truncated coding regions), and (c) 
obtain similar nucleic acid molecules from other species. Such nucleic acid 
molecules can be obtained in a variety of ways including screening appropriate 
expression libraries with antibodies of the present invention; traditional cloning 
techniques using oligonucleotide probes of the present invention to screen 
appropriate libraries of DNA; and PCR amplification of appropriate libraries or DNA 
using oligonucleotide primers of the present invention. Preferred libraries to screen 
or from which to amplify nucleic acid molecules include plant cDNA libraries as 
well as genomic DNA libraries. Similarly, preferred DNA sources to screen or from 
which to amplify nucleic acid molecules include adult cDNA and genomic DNA. 
Techniques to clone and amplify genes are disclosed, for example, in Sambrook et 
al., ibid. 

Recombination constructs can be made using the starting materials above or 
with additional materials, using methods well-known in the art. In general, the 
sequences can be manipulated to have ligase-compatible ends, and incubated with 
ligase to generate full constructs. For example, restriction enzymes can be chosen 
on the basis of their ability to cut at an acceptable site in both sequences to be ligated, 
or a linker may be added to convert the sequence end(s) to ones that are compatible. 
The methods for conducting these types of molecular manipulations are well-known 
in the art, and are described in detail in Sambrook et al., Molecular Cloning: A 
Laboratory Manual (Cold Spring Harbor Laboratory Press, 1989) and Ausubel et al., 
Current Protocols in Molecular Biology (Greene Publishing Associates, Inc., 1993). 
The methods described in Tinland et al., 91 Proc. Natl. Acad. 
Sci. USA 8000 (1994) can also be used. 
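
The selection of a restriction enzyme that cuts both sequences to be ligated can be sketched in Python as follows. The recognition sequences listed for EcoRI, BamHI and HindIII are the well-known ones; the two fragments and the function names are hypothetical and serve only to illustrate the idea.

```python
# Recognition sequences of three common restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def cut_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of a site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def shared_enzymes(seq_a, seq_b):
    """List enzymes whose recognition site occurs in both sequences."""
    return [name for name, site in RECOGNITION_SITES.items()
            if cut_sites(seq_a, site) and cut_sites(seq_b, site)]


# Hypothetical fragments, for illustration only.
insert_fragment = "TTGAATTCAGGCTAGGATCCAT"
vector_fragment = "CCGGATCCTTAAGCTTGG"
candidates = shared_enzymes(insert_fragment, vector_fragment)
```

An enzyme found in both sequences is only a candidate; whether its site is "acceptable" still depends on where it falls relative to the coding and regulatory regions, which this sketch does not evaluate.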

The present invention also includes nucleic acid molecules that are 
oligonucleotides capable of hybridizing, under stringent hybridization conditions, 
with complementary regions of other, preferably longer, nucleic acid molecules of 
the present invention. Oligonucleotides of the present invention can be RNA, DNA, 
or derivatives of either. The minimum size of such oligonucleotides is the size 
required for formation of a stable hybrid between an oligonucleotide and a 
complementary sequence on a nucleic acid molecule of the present invention. 
Minimal size characteristics are disclosed herein. The present invention includes 
oligonucleotides that can be used as, for example, probes to identify nucleic acid 
molecules, primers to produce nucleic acid molecules or therapeutic reagents. 
Stringent hybridization conditions are determined based on defined physical 
properties of the gene to which the nucleic acid molecule is being hybridized, and 
can be defined mathematically. Stringent hybridization conditions are those 
experimental parameters that allow an individual skilled in the art to identify 
significant similarities between heterologous nucleic acid molecules. These 
conditions are well known to those skilled in the art. See, for example, Sambrook, 
et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs 
Press, and Meinkoth et al., 1984, Anal. Biochem. 138, 267-284. 
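
One commonly cited way of defining hybridization stringency mathematically is the empirical melting temperature estimate for DNA:DNA hybrids generally attributed to Meinkoth and Wahl. The Python sketch below assumes that formula; the function name, parameter names and example values are hypothetical, and the estimate is an approximation rather than a substitute for empirical optimization.

```python
import math


def melting_temperature(gc_percent, length_bp, monovalent_molar,
                        formamide_percent=0.0):
    """Estimate the melting temperature (degrees C) of a DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    where M is the molar monovalent cation concentration and L is the
    hybrid length in base pairs.
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Hypothetical example: 500 bp hybrid, 50% G+C, 1 M monovalent cation.
tm_no_formamide = melting_temperature(50.0, 500, 1.0)
tm_with_formamide = melting_temperature(50.0, 500, 1.0, formamide_percent=50.0)
```

Under these assumed values the estimate is about 101 degrees C without formamide, and the formamide term lowers it by 0.61 degrees per percent formamide, which is why formamide allows stringent hybridization at lower temperatures.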

Recombinant molecules of the present invention may also (a) contain 
secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed 
protein of the present invention to be secreted from the cell that produces the protein 
and/or (b) contain fusion sequences which lead to the expression of nucleic acid 
molecules of the present invention as fusion proteins. Recombinant molecules may 
also include intervening and/or untranslated sequences surrounding and/or within the 
nucleic acid sequences of nucleic acid molecules of the present invention. 

One embodiment of the present invention includes recombinant vectors, 
which include at least one isolated nucleic acid molecule of the present invention, 
inserted into any vector capable of delivering the nucleic acid molecule into a host 
cell. Such a vector contains heterologous nucleic acid sequences, that is nucleic acid 
sequences that are not naturally found adjacent to nucleic acid molecules of the 
present invention and that preferably are derived from a species other than the 
species from which the nucleic acid molecule(s) are derived. The vector can be 
either RNA or DNA, either prokaryotic or eukaryotic, and typically is a virus or a 
plasmid. Recombinant vectors can be used in the cloning, sequencing, and/or 
other manipulation of nucleic acid molecules of the present invention. 

One type of recombinant vector, referred to herein as a recombinant 
molecule, comprises a nucleic acid molecule of the present invention operatively 
linked to an expression vector. The phrase operatively linked refers to insertion of 
a nucleic acid molecule into an expression vector in a manner such that the molecule 
is able to be expressed when transformed into a host cell. As used herein, an 
expression vector is a DNA or RNA vector that is capable of transforming a host cell 
and of effecting expression of a specified nucleic acid molecule. Expression vectors 
can be either prokaryotic or eukaryotic, and are typically viruses or plasmids. 
Expression vectors of the present invention include any vectors that function (i.e., 
direct gene expression) in recombinant cells of the present invention, including in 
bacterial, fungal, endoparasite, insect, other animal, and plant cells. 

In particular, expression vectors of the present invention contain regulatory 
sequences such as transcription control sequences, translation control sequences, 
origins of replication, and other regulatory sequences that are compatible with the 
recombinant cell and that control the expression of nucleic acid molecules of the 
present invention. In particular, recombinant molecules of the present invention 
include transcription control sequences. Transcription control sequences are 
sequences which control the initiation, elongation, and termination of transcription. 
Particularly important transcription control sequences are those which control 
transcription initiation, such as promoter, enhancer, operator and repressor 
sequences. Suitable transcription control sequences include any transcription control 
sequences that can function in at least one of the recombinant cells of the present 
invention. A variety of such transcription control sequences are known to those 
skilled in the art. Preferred transcription control sequences include those which 
function in bacterial, yeast, insect and mammalian cells, such as, but not limited to, 
tac, lac, trp, trc, oxy-pro, omp/lpp, rrnB, bacteriophage lambda (such as lambda pL 
and lambda pR and fusions that include such promoters), bacteriophage T7, T7lac, 
bacteriophage T3, bacteriophage SP6, bacteriophage SPO1, metallothionein, 
alpha-mating factor, Pichia alcohol oxidase, alphavirus subgenomic promoters (such as 
Sindbis virus subgenomic promoters), antibiotic resistance gene, baculovirus, 
Heliothis zea insect virus, vaccinia virus, herpesvirus, raccoon poxvirus, other 
poxvirus, adenovirus, cytomegalovirus (such as immediate early promoters), 
simian virus 40, retrovirus, actin, retroviral long terminal repeat, Rous sarcoma virus, 
heat shock, phosphate and nitrate transcription control sequences as well as other 
sequences capable of controlling gene expression in prokaryotic or eukaryotic cells. 
Additional suitable transcription control sequences include tissue-specific promoters 
and enhancers as well as lymphokine-inducible promoters (e.g., promoters inducible 
by interferons or interleukins). Transcription control sequences of the present 
invention can also include naturally occurring transcription control sequences 
naturally associated with plants. The present invention also comprises expression 
vectors comprising a nucleic acid molecule described herein. 

For instance, the following promoters would be useful in early expression of 
the present sequences: Ogs4B (Tsuchiya et al., 36 Plant Cell Physiology 487 (1994)); 
TA29 (Koltunow et al., 2 Plant Cell 1201 (1990)); A3 & A9 (Paul et al., 19 Plant 
Molecular Biology 611 (1992)). In order to then constitutively express the sequences 
described above, the construct optionally contains, for example, a 35S promoter. 

Vectors which comprise the above sequences are within the scope of the 
present invention, as are plants transformed with the above sequences. Vectors may 
be obtained from various commercial sources, including Clontech Laboratories, Inc. 
(Palo Alto, CA), Stratagene (La Jolla, CA), Invitrogen (Carlsbad, CA), New England 
Biolabs (Beverly, MA) and Promega (Madison, WI). Preferred vectors are those 
which are capable of transferring the sequences disclosed herein into plant cells or 
plant parts. 

Recombinant DNA technologies can be used to improve expression of 
transformed nucleic acid molecules by manipulating, for example, the number of 
copies of the nucleic acid molecules within a host cell, the efficiency with which 
those nucleic acid molecules are transcribed, the efficiency with which the resultant 
transcripts are translated, and the efficiency of post-translational modifications. 
Recombinant techniques useful for increasing the expression of nucleic acid 
molecules of the present invention include, but are not limited to, operatively linking 
nucleic acid molecules to high-copy number plasmids, integration of the nucleic acid 
molecules into one or more host cell chromosomes, addition of vector stability 
sequences to plasmids, substitutions or modifications of transcription control signals 
(e.g., promoters, operators, enhancers), substitutions or modifications of translational 
control signals (e.g., ribosome binding sites, Shine-Dalgarno sequences), 
modification of nucleic acid molecules of the present invention to correspond to the 
codon usage of the host cell, deletion of sequences that destabilize transcripts, and 
use of control signals that temporally separate recombinant cell growth from 
recombinant enzyme production during fermentation. The activity of an expressed 
recombinant protein of the present invention may be improved by fragmenting, 
modifying, or derivatizing nucleic acid molecules encoding such a protein. 
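
Modification of a coding sequence to correspond to the codon usage of a host cell can be sketched in Python as below. The codon-to-amino-acid assignments are from the standard genetic code, but the table of preferred codons is entirely hypothetical and does not represent measured codon usage of any host; the function name recode is likewise an assumption.

```python
# Standard genetic code entries for the codons used in this example.
CODON_TO_AMINO_ACID = {
    "ATG": "M",
    "GCA": "A", "GCC": "A",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}

# HYPOTHETICAL host preferences: one preferred codon per amino acid.
PREFERRED_CODON = {"M": "ATG", "A": "GCC", "K": "AAG", "*": "TGA"}


def recode(dna):
    """Replace each codon with the host's preferred synonymous codon."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(PREFERRED_CODON[CODON_TO_AMINO_ACID[c]] for c in codons)


# Hypothetical coding sequence, for illustration only.
original = "ATGGCAAAATAA"
recoded = recode(original)
```

Because only synonymous codons are exchanged, the recoded sequence encodes the same amino acid sequence as the original; a practical design would draw the preference table from codon usage data for the intended host.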

Nucleic acids of the present invention may be transferred to cells according 
to the methods of the present invention, as well as using any of the following 
well-known means: infective, vector-containing bacterial strains (such as Agrobacterium 
rhizogenes and Agrobacterium tumefaciens) according to, e.g., Zambryski, 43 Ann. 
Rev. Pl. Physiol. Pl. Mol. Biol. 465 (1992); pollen-tube transformation [Zhon-xun 
et al., 6 Plant Molec. Bio. 165 (1988)]; direct transformation of germinating seeds 
[Toepfer et al., 1 Plant Cell 133 (1989)]; polyethylene glycol or electroporation 
transformation [Christou et al., 84 Proc. Nat. Acad. Sci. 3662 (1987)]; and biolistic 
processes [Yang & Christou, Particle Bombardment Technology for Gene Transfer 
(1994)]. 

The transformed cells may be induced to form transformed plants via 
organogenesis or embryogenesis, according to the procedures of Dixon, Plant Cell 
Culture: A Practical Approach (IRL Press, Oxford 1987). 

Any seed, embryo, plant or plant part is amenable to the present techniques. 
Of course, the agronomically-significant seeds, embryos, plants or plant parts are 
preferred. Soybean; maize; sugar cane; beet; tobacco; wheat; barley; poppy; rape; 
sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; tomato; lettuce; 
chicory; pepper; melon; cabbage; oat; rye; cotton; flax; potato; pine; walnut; citrus 
(including oranges, grapefruit etc.); hemp; oak; rice; petunia; orchids; Arabidopsis; 
broccoli; cauliflower; Brussels sprouts; onion; garlic; leek; squash; pumpkin; celery; 
pea; bean (including various legumes); strawberries; grapes; apples; pears; peaches; 
banana; palm; cocoa; cucumber; pineapple; apricot; plum; sugar beet; lawn grasses; 
maple; triticale; safflower; peanut; and olive are among the preferred seeds, embryos, 
plants or plant parts. Particularly preferred are: soybean, tobacco and maize seeds, 
embryos, plants or plant parts. However, Arabidopsis seeds, embryos, plants or plant 
parts are also preferred, since it is an excellent system for study of plant genetics. 

Preferred are those genes or sequences which are agronomically significant. 
Examples include genes encoding male sterility, foreign organism resistance (viruses or 
bacteria), including genes which produce bacterial endotoxins, such as Bacillus 
thuringiensis endotoxin, genes involved in specific biosynthetic pathways (e.g., in fruit 
ripening, oil or pigment biosynthesis, seed formation, or carbohydrate metabolism), 
genes involved in environmental tolerance (e.g., salt tolerance, lodging tolerance, 
cold/frost tolerance, drought tolerance, or tolerance to anaerobic conditions), or 
genes involved in nutrient content (e.g., protein content, carbohydrate content, amino 
acid content, fatty acid content), genes involved in photosynthetic pathways, or genes 
involved in self-incompatibility. The choice of gene or sequence induced to 
recombine in the present invention is not limited. Examples of genes and how to 
obtain them are available through reference articles, books and supply catalogs, such 
as The Sourcebook (1-800-551-5291). Sambrook et al., Molecular Cloning: A 
Laboratory Manual (Cold Spring Harbor Laboratory Press, 1989) and Weising et al., 
22 Ann. Rev. Gen. 421 (1988) contain a synthesis of the information that is 
well-known in this art. 

Plant envelope sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant envelope sequence and comprise a nucleic acid sequence 
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 5, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 5; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 6, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 6; 



(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 6; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 
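
Item (f) refers to a sequence that is fully complementary to another sequence. As a minimal Python sketch, assuming plain DNA strings written 5' to 3' and containing only A, C, G and T, full complementarity can be checked as follows; the function names and the example sequence are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the fully complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def is_fully_complementary(strand_a, strand_b):
    """True if strand_b pairs with strand_a at every position."""
    return strand_b.upper() == reverse_complement(strand_a).upper()


# Hypothetical sequence, for illustration only.
example = "ATGGCCTAA"
partner = reverse_complement(example)
```

A single mismatched position is enough for is_fully_complementary to return False, which distinguishes a fully complementary sequence from one that merely hybridizes under stringent conditions.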

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant envelope proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant envelope protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant integrase sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant integrase sequence and comprise a nucleic acid sequence 
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 9, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 9; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 10, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 10; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 10; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant integrase proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant integrase protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant reverse transcriptase sequences and constructs which comprise the 
sequences are provided, as are cells, seeds, embryos and plants comprising them. 
Preferred are isolated nucleic acid molecules, wherein said nucleic acid molecules 
encode at least a portion of a plant reverse transcriptase sequence and comprise a 
nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 11, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 11; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 85% identity to SEQ ID NO 12, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 12; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 12; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant reverse transcriptase proteins comprising an 
amino acid sequence encoded by the above. Methods to impart agronomically- 
significant characteristics to at least one plant cell are also provided, comprising: 
contacting a plant reverse transcriptase protein as described to at least one plant cell 
under conditions sufficient to allow a nucleic acid molecule to enter said cell, 
wherein said nucleic acid molecule encodes an agronomically-significant 
characteristic. 

Plant RNAseH sequences and constructs which comprise the sequences are 
provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant RNAseH sequence and comprise a nucleic acid sequence 
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to SEQ ID NO 15, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes SEQ ID NO 15; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has 
greater than 95% identity to SEQ ID NO 16, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 16; 

(e) a nucleic acid sequence which encodes an allelic variant of SEQ ID NO 16; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant cells comprising an isolated nucleic acid molecule above are particularly 
preferred. Also preferred are plant RNAseH proteins comprising an amino acid 
sequence encoded by the above. Methods to impart agronomically-significant 
characteristics to at least one plant cell are also provided, comprising: contacting a 
plant RNAseH protein as described to at least one plant cell under conditions 
sufficient to allow a nucleic acid molecule to enter said cell, wherein said nucleic 
acid molecule encodes an agronomically-significant characteristic. 

Plant retroelement sequences and constructs which comprise the sequences 
are provided, as are cells, seeds, embryos and plants comprising them. Preferred are 
isolated nucleic acid molecules, wherein said nucleic acid molecules encode at least 
a portion of a plant retroelement sequence and comprise a nucleic acid sequence 
selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 95% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 2; SEQ ID NO 5; SEQ 
ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13; SEQ ID NO 15; and SEQ 
ID NO 17, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(b) a nucleic acid sequence which is selected from the group consisting of: SEQ ID 
NO 2; SEQ ID NO 5; SEQ ID NO 7; SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 13; 
SEQ ID NO 15; and SEQ ID NO 17; 

(c) a nucleic acid sequence which encodes an amino acid sequence which has more 
than 90% identity to an amino acid sequence selected from the group consisting of 
SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ 
ID NO 14; SEQ ID NO 16; SEQ ID NO 18, wherein said identity can be determined 
using the DNAsis computer program and default parameters; 

(d) a nucleic acid sequence which encodes an amino acid sequence selected from the 
group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ ID NO 8; SEQ ID NO 10; 
SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and SEQ ID NO 18; 

(e) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence selected from the group consisting of: SEQ ID NO 4; SEQ ID NO 6; SEQ 
ID NO 8; SEQ ID NO 10; SEQ ID NO 12; SEQ ID NO 14; SEQ ID NO 16; and 
SEQ ID NO 18; and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Nucleic acid molecules as above, which further comprise at least one nucleic acid 
sequence which encodes at least one agronomically-significant characteristic, are 
preferred. More preferred are those nucleic acid molecules as described wherein the 
agronomically-significant characteristic is selected from the group consisting of: 
male sterility; self-incompatibility; foreign organism resistance; improved 
biosynthetic pathways; environmental tolerance; photosynthetic pathways; and 
nutrient content. Also more preferred are those isolated nucleic acid molecules as 
described, wherein the agronomically significant characteristic is selected from the 
group consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed 
formation; starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; 
tolerance to anaerobic conditions; protein content; carbohydrate content (including 
sugars and starches); amino acid content; and fatty acid content. 

Seeds and plants comprising a nucleic acid molecule as described are also preferred. 
More preferred are plants as described, wherein the plant is selected from the group 
consisting of: soybean; maize; sugar cane; beet; tobacco; wheat; barley; poppy; rape; 
sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; tomato; lettuce; 
chicory; pepper; melon; cabbage; oat; rye; cotton; flax; potato; pine; walnut; citrus 
(including oranges, grapefruit etc.); hemp; oak; rice; petunia; orchids; Arabidopsis; 
broccoli; cauliflower; Brussels sprouts; onion; garlic; leek; squash; pumpkin; celery; 
pea; bean (including various legumes); strawberries; grapes; apples; pears; peaches; 
banana; palm; cocoa; cucumber; pineapple; apricot; plum; sugar beet; lawn grasses; 
maple; triticale; safflower; peanut; and olive. Most preferred is a plant as described 
which is a soybean plant. 

Plant retroelements comprising an amino acid sequence encoded by a nucleic acid 
sequence described are also provided. Plant cells comprising a nucleic acid molecule 
described herein, as well as plant retroviral proteins encoded by nucleic acid 
molecules described herein are provided. 

Moreover, methods to transfer nucleic acid into a plant cell, comprising contacting 
a nucleic acid molecule of the present invention with at least one plant cell under 
conditions sufficient to allow said nucleic acid molecule to enter at least one cell of 
said plant are provided. In particular, there are provided methods to impart 
agronomically-significant characteristics to at least one plant cell, comprising: 
contacting a plant retroelement of the present invention to at least one plant cell 
under conditions sufficient to allow a nucleic acid molecule to enter said cell, 
wherein said nucleic acid molecule encodes an agronomically-significant 
characteristic. Methods as described, wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; self- 
incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content are 
preferred, as are methods wherein the agronomically-significant characteristic is 
selected from the group consisting of: fruit ripening; oil biosynthesis; pigment 
biosynthesis; seed formation; starch metabolism; salt tolerance; cold/frost tolerance; 
drought tolerance; tolerance to anaerobic conditions; protein content; carbohydrate 
content (including sugars and starches); amino acid content; and fatty acid content. 

Plant retroelement sequences comprising specialized signals, and constructs 
which comprise the sequences are provided, as are cells, seeds, embryos and plants 
comprising them. Preferred are isolated nucleic acid molecules, comprising a nucleic 
acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 95% identity to SEQ ID NO 2, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which is SEQ ID NO 2; 

(c) a nucleic acid sequence which encodes amino acid sequence SEQ ID NO 4; and 

(d) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); and a nucleic acid sequence of (c). 

Plant retroelements as described above, which further comprise at least one nucleic 
acid sequence which encodes at least one agronomically-significant characteristic are 
preferred. More preferred are those retroelements wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; 
self-incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content or those 
wherein the agronomically significant characteristic is selected from the group 
consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance 
to anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content. 

Preferred are plant retroviral particles comprising an isolated retroelement as 
described, and seeds and plants comprising the retroelements as described. More 
preferred plants include soybean; maize; sugar cane; beet; tobacco; wheat; barley; 
poppy; rape; sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; tomato; 
lettuce; chicory; pepper; melon; cabbage; oat; rye; cotton; flax; potato; pine; walnut; 
citrus (including oranges, grapefruit etc.); hemp; oak; rice; petunia; orchids; 
Arabidopsis; broccoli; cauliflower; Brussels sprouts; onion; garlic; leek; squash; 
pumpkin; celery; pea; bean (including various legumes); strawberries; grapes; 
apples; pears; peaches; banana; palm; cocoa; cucumber; pineapple; apricot; plum; 
sugar beet; lawn grasses; maple; triticale; safflower; peanut; and olive. Soybean is 
most preferred. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroelement as described with at least one plant cell under 
conditions sufficient to allow said plant retroelement to enter said cell. Methods to 
impart agronomically-significant characteristics to a plant, comprising contacting a 
plant retroelement as described with at least one plant cell under conditions sufficient 
to allow said plant retroelement to enter said cell are also preferred. Those methods 
wherein the plant retroelement is contacted with said cell via a plant retroviral 
particle described herein are preferred. 

Plant retroviruses are also provided. In particular, plant retroviral particles 
comprising a plant-derived retrovirus envelope protein are provided. Plant retroviral 
particles comprising a plant-derived retrovirus envelope protein and which further 
comprise a plant retroviral protein selected from the group consisting of: 
plant-derived integrase; plant-derived reverse transcriptase; plant-derived gag; and 
plant-derived RNAseH are preferred. 

Plant retroviral particles comprising specialized retroviral proteins, and cells, 
seeds, embryos and plants which comprise the retroviral particles are provided. 
Preferred are isolated retroviral particles comprising a plant retroviral protein 
encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence comprising (i) a nucleic acid sequence which encodes 
at least one plant retroviral envelope protein, and (ii) a nucleic acid sequence which 
has more than 60% identity to a nucleic acid sequence selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 15; SEQ ID NO 26; SEQ 
ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30; and SEQ ID NO 31, 
wherein said identity can be determined using the DNAsis computer program and 
default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence (a); 

(c) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid sequence of (a); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); and a nucleic acid sequence of (c). 

In particular, plant retroviral particles wherein said nucleic acid sequence as 
described in (a) comprises a plant envelope nucleic acid specifically mentioned in 
claim 6 are preferred. Those particles which further comprise at least one nucleic 
acid sequence which encodes at least one agronomically-significant characteristic 
are also preferred. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

More preferred are isolated retroviral particles comprising a plant retroviral 
protein encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 80% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 9; SEQ ID NO 11; and 
SEQ ID NO 15, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(b) a nucleic acid sequence which encodes a nucleic acid selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; and SEQ ID NO 15; 

(c) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence selected from the group consisting of: a nucleic acid sequence 
of (a); and a nucleic acid sequence of (b); 

(d) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(e) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); and a nucleic acid sequence of (d). 

Nucleic acids as above, which further comprise at least one nucleic acid sequence 
which encodes at least one agronomically-significant characteristic, are preferred. 
More preferred are those nucleic acids wherein the agronomically-significant 
characteristic is selected from the group consisting of: male sterility; self- 
incompatibility; foreign organism resistance; improved biosynthetic pathways; 
environmental tolerance; photosynthetic pathways; and nutrient content, or wherein 
the agronomically significant characteristic is selected from the group consisting of: 
fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; starch 
metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance to 
anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

Also preferred are isolated retroviral particles comprising a plant retroviral 
protein encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 60% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ 
ID NO 15; SEQ ID NO 26; SEQ ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ 
ID NO 30; and SEQ ID NO 31, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes a nucleic acid selected from the group 
consisting of: SEQ ID NO 9; SEQ ID NO 11; SEQ ID NO 15; SEQ ID NO 26; SEQ 
ID NO 27; SEQ ID NO 28; SEQ ID NO 29; SEQ ID NO 30; and SEQ ID NO 31; 

(c) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence selected from the group consisting of: a nucleic acid sequence 
of (a); and a nucleic acid sequence of (b); 

(d) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(e) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); and a nucleic acid sequence of (d). 

Also preferred are isolated retroviral particles comprising a plant retroviral 
sequence encoded by a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence which has more than 80% identity to a nucleic acid 
sequence selected from the group consisting of: SEQ ID NO 1; SEQ ID NO 2; and 
SEQ ID NO 3, wherein said identity can be determined using the DNAsis computer 
program and default parameters; 

(b) a nucleic acid sequence which encodes a nucleic acid selected from the group 
consisting of: SEQ ID NO 1; SEQ ID NO 2; and SEQ ID NO 3; 

(c) a nucleic acid sequence which encodes SEQ ID NO 4; 

(d) a nucleic acid sequence which encodes an amino acid sequence encoded by a 
nucleic acid sequence selected from the group consisting of: a nucleic acid sequence 
of (a); a nucleic acid sequence of (b); and a nucleic acid sequence of (c); 

(e) a nucleic acid sequence which encodes an allelic variant of an amino acid 
sequence encoded by a nucleic acid selected from the group consisting of: a nucleic 
acid sequence of (a); a nucleic acid sequence of (b); and a nucleic acid sequence of 
(c); and 

(f) a nucleic acid sequence fully complementary to a nucleic acid sequence selected 
from the group consisting of: a nucleic acid sequence of (a); a nucleic acid sequence 
of (b); a nucleic acid sequence of (c); a nucleic acid sequence of (d); and a nucleic 
acid sequence of (e). 

Plant retroviral particles as described above, which further comprise an 
envelope-encoding nucleic acid sequence specifically described herein, are preferred. Preferred 
are those retroviral particles which further comprise at least one nucleic acid 
sequence which encodes at least one agronomically-significant characteristic. 

Also provided are methods to transfer nucleic acid into a plant cell, comprising 
contacting a plant retroviral particle as described above to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. More 
preferred are methods to impart agronomically-significant characteristics to a plant, 
comprising contacting a plant retroviral particle as described to at least one plant cell 
under conditions sufficient to allow said nucleic acid to enter said cell. 

Also provided, as part of the present invention, are isolated nucleic acids 
having at least 20 contiguous nucleotides of the sequence shown in SEQ ID NO 17. 
"At least" means that this is the lower limit and the number can be any whole 
number increment up to the total number of bases in SEQ ID NO 17. For example, 
isolated nucleic acid sequences which are 25, 30, 35, 40, 45, 50, 55, 60, 65 and 70 
nucleotides in length are within the scope of the present invention. 
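
The "at least 20 contiguous nucleotides" criterion can be illustrated with a short 
Python sketch. This is only an illustration under stated assumptions: the function 
names are hypothetical, the reference string is a made-up placeholder and not 
SEQ ID NO 17, and exact string matching stands in for whatever comparison a 
practitioner would actually use. 

```python
def shares_contiguous_run(query: str, reference: str, min_len: int = 20) -> bool:
    """Return True if query and reference share an identical run of >= min_len bases."""
    query, reference = query.upper(), reference.upper()
    if len(query) < min_len or len(reference) < min_len:
        return False
    # Every window of min_len bases found in the reference.
    ref_windows = {reference[i:i + min_len]
                   for i in range(len(reference) - min_len + 1)}
    # Any longer shared run necessarily contains a shared window of min_len bases.
    return any(query[i:i + min_len] in ref_windows
               for i in range(len(query) - min_len + 1))


def allowed_fragment_lengths(reference_len: int, min_len: int = 20) -> range:
    """Whole-number fragment lengths from the lower limit up to the full reference."""
    return range(min_len, reference_len + 1)


# Placeholder reference (NOT SEQ ID NO 17), 40 bases long.
placeholder_reference = "ACGT" * 10
query_with_run = "TTTT" + placeholder_reference[5:30] + "CCCC"
query_without_run = "A" * 30

print(shares_contiguous_run(query_with_run, placeholder_reference))
print(shares_contiguous_run(query_without_run, placeholder_reference))
print(list(allowed_fragment_lengths(len(placeholder_reference)))[:3])
```

With a real reference sequence, the same check would be run against the full 
sequence listing entry rather than the placeholder string. 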

The following paragraph is designed to elaborate on the best mode and is not 
indicative of the sole means for making and carrying out the present invention. This 
paragraph is not intended to be limiting. The best way to make the present nucleic 
acids is to clone the nucleic acids from the respective organisms or to amplify them 
from genomic cDNA by the polymerase chain reaction using appropriate primers. The 
best way to make the present retroelements is to assemble the nucleic acids using 
standard cloning procedures. Transcriptional controls can be manipulated by 
inserting enhancers in or near the 5' LTR. Marker genes or genes of interest can be 
inserted within the retroelement. The best way to make the present retroviral 
particles is to express the retroelement, preferably at high levels, in plant cells and 
to harvest the particles by sucrose gradient fractionation. The best way to use the 
present nucleic acids is to allow retroviral particles to come into contact with 
plant cells. Expression of marker genes carried by the retroelement can be used as 
one measure of infection and integration. 

Also provided by the present invention are isolated nucleic acid molecules, 
wherein said nucleic acid molecule encodes at least a portion of a plant retroelement 
reverse transcriptase and comprises a nucleic acid sequence selected from the group 
consisting of: 

(a) a nucleic acid sequence having more than 85% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 85% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided. Also 
provided by the present invention are isolated nucleic acid molecules described, 
wherein said nucleic acid molecule encodes at least a portion of a plant envelope 
sequence and comprises a nucleic acid sequence selected from the group consisting 
of: 

(a) a nucleic acid sequence which has more than 90% identity to 
SEQ ID NO 5, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
which has greater than 85% identity to SEQ ID NO 6, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of SEQ 
ID NO 5; and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); a nucleic acid sequence of (b); and a nucleic acid 
sequence of (c). 

Plant cells comprising this embodiment are also provided, as are methods to impart 
agronomically-significant characteristics to at least one plant cell, comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Also part of the present invention are isolated nucleic acid molecules, 
wherein said nucleic acid molecule encodes at least a portion of a plant retroelement 
reverse transcriptase and comprises a nucleic acid sequence selected from the group 
consisting of: 

(a) a nucleic acid sequence having more than 95% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 95% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided. 
Also provided are methods to impart agronomically-significant characteristics to at 
least one plant cell, comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Also provided are isolated nucleic acid molecules, wherein said nucleic acid 
molecule encodes at least a portion of a plant retroelement reverse transcriptase and 
comprises a nucleic acid sequence selected from the group consisting of: 

(a) a nucleic acid sequence selected from the group consisting of 
even-numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ 
ID NO 164, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
selected from the group consisting of odd-numbered SEQ ID NOs 
inclusive from SEQ ID NO 43 through SEQ ID NO 165, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b). 

Seeds and plants comprising the nucleic acid molecules are also provided, as are 
nucleic acids as described which comprise gag, pol and env genes and which 
comprise adenine-thymine-guanine (ATG) as the gag gene start codon. Moreover, 
those nucleic acids which further comprise SEQ ID NO 5 are also provided. 
Also provided are methods to impart agronomically-significant characteristics to at 
least one plant cell, comprising: 

contacting a nucleic acid molecule described to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

Nucleic acid molecules of the present invention which further comprise at 
least one nucleic acid sequence which encodes at least one agronomically-significant 
characteristic are also provided. Those nucleic acid molecules wherein the 
agronomically-significant characteristic is selected from the group consisting of: 
male sterility; self-incompatibility; foreign organism resistance; improved 
biosynthetic pathways; environmental tolerance; photosynthetic pathways; and 
nutrient content are preferred. Also preferred are those nucleic acid molecules 
wherein the agronomically significant characteristic is selected from the group 
consisting of: fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought tolerance; tolerance 
to anaerobic conditions; protein content; carbohydrate content (including sugars and 
starches); amino acid content; and fatty acid content. 

Also provided are isolated plant retroviral particles comprising a nucleic acid 
molecule of the present invention. 

Preferred plants are selected from the group consisting of: soybean; maize; 
sugar cane; beet; tobacco; wheat; barley; poppy; rape; sunflower; alfalfa; sorghum; 
rose; carnation; gerbera; carrot; tomato; lettuce; chicory; pepper; melon; cabbage; 
oat; rye; cotton; flax; potato; pine; walnut; citrus (including oranges, grapefruit etc.); 
hemp; oak; rice; petunia; orchids; Arabidopsis; broccoli; cauliflower; Brussels sprouts; 
onion; garlic; leek; squash; pumpkin; celery; pea; bean (including various legumes); 
strawberries; grapes; apples; pears; peaches; banana; palm; cocoa; cucumber; 
pineapple; apricot; plum; sugar beet; lawn grasses; maple; triticale; safflower; 
peanut; and olive. 

In these new aspects of the invention, it is understood that the materials and 
methods described previously are useful in obtaining the present materials. 
Moreover, the discussion as to scope and usefulness of the invention, including the 
percent identities, retroviral uses and constructs, plants transfected, methods for 
improving crops, etc., is applicable to the present new aspects as well. For 
instance, combinations of the previously disclosed materials with the present 
materials are certainly within the scope of the present disclosure. 

The following examples are not intended to limit the scope of the present 
invention as described and claimed. They are simply for the purpose of illustration. 

EXAMPLES 



Example 1 Characterizing the Arabidopsis Retroelements 
("Tat" and "Athila" elements) 

Plant material and Southern hybridizations: The Arabidopsis Information 
Service supplied the following seed stocks (Kranz and Kirchheim (1987) 
Arabidopsis Inform. Serv. 24): Col-0, La-0, Kas-1, Co-4, Sei-0, Mv-0, Ll-0, Cvi-0, 
Fi-3, Ba-1, Hau-0, Aa-0, Ms-0, Ag-0, Ge-0, No-0 and Mh-0. Genomic DNA was 
extracted using Qiagen genomic tips and protocols supplied by Qiagen. For 
Southern hybridizations, the resulting DNA was digested with EcoRI, 
electrophoresed on 0.8% agarose and transferred to Gene Screen Plus membranes 
using the manufacturer's alkaline transfer protocol (New England Nuclear). All 
hybridizations were performed as described. Church and Gilbert (1984) Proc. Natl 
Acad. Sci. USA 81: 1991-1995. 

Library screening, probe preparation and PCR: Tat1 clones were obtained by 
screening a Landsberg erecta (La-0) λ phage library (Voytas et al. (1990) Genetics 
126: 713-721), using a probe derived by PCR amplification of La-0 DNA. The 
primers for probe amplification were based on the three published Tat1 sequences 
(DVO158, 5'-GGGATCCGCAATTAGAATCT-3'; 
DVO159, 5'-CGAATTCGGTCCACTTCGGA-3'). Peleman et al. (1991) Proc. Natl. Acad. Sci. 
USA 88: 3618-3622. Subsequent probes were restriction fragments of cloned Tat1 
elements, and all probes were radiolabeled by random priming (Promega). Long 
PCR was performed using the Expand Long Template PCR System (Boehringer 
Mannheim) with LTR-specific primers 
(DVO354, 5'-CCACAAGATTCTAATTGCGGATTC-3'; 
DVO355, 5'-CCGAAATGGACCGAACCCGACATC-3'). The protocol used was for PCR 
amplification of DNA up to 15 kb. The following PCR primers were used to confirm 
the structure of Tat1-3: DVO405 (5'-TTTCCAGGCTCTTGACGAGATTTG-3') for 
the 3' non-coding region, DVO385 (5'-CGACTCGAGCTCCATAGCGATG-3') for 
the second ORF of Tat1-3 (note that the seventh base was changed from an A to a 
G to make an XhoI and a SalI restriction site) and DVO371 
(5'-CGGATTGGGCCGAAATGGACCGAA-3') for the 3' LTR. 
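
The engineered restriction site in primer DVO385 can be located with a few lines 
of Python. This is a minimal sketch, not part of the original methods: the helper 
name find_sites is hypothetical, the primer string is the one quoted above, and 
CTCGAG is the standard XhoI recognition sequence. Other recognition sequences can 
be passed to the same helper. 

```python
def find_sites(seq: str, site: str) -> list[int]:
    """Return the 1-based start position of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)          # convert to 1-based coordinates
        start = seq.find(site, start + 1)    # allow overlapping occurrences
    return positions


DVO385 = "CGACTCGAGCTCCATAGCGATG"  # primer sequence as quoted in the text
XHOI_SITE = "CTCGAG"               # XhoI recognition sequence

xhoi_positions = find_sites(DVO385, XHOI_SITE)
seventh_base = DVO385[6]           # the base that was changed from A to G

print(xhoi_positions)   # [4]
print(seventh_base)     # G
```

The site starts at base 4 and therefore spans bases 4 to 9, so the altered seventh 
base lies inside the XhoI recognition sequence. 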

DNA sequencing: Clones were sequenced either by the DNA sequencing 
facility at Iowa State University or with the fmol sequencing kit (Promega). DNA 
from the λ phage clones was initially subcloned into the vector pBluescript II KS- and 
transformed into the E. coli host strain XL1 Blue (Stratagene). Ausubel et al. 
(1987) Current Protocols in Molecular Biology. Greene/Wiley Interscience, New 
York. Subclones in the vector pMOB were used for transposon mutagenesis with the 
TN1000 sequencing kit (Gold Biotechnologies). Transposon-specific primers were 
used for DNA sequencing reactions. 

Sequence analysis: Sequence analysis was performed using the GCG 
software package (Devereux et al. (1984) Nucl. Acids Res. 12: 387-395), DNA 
Strider 1.2 (Marck (1991) DNA Strider 1.2, Gif-sur-Yvette, France), the BLAST 
search tool (Altschul et al. (1990) J. Mol. Biol. 215: 403-410) and the tRNAscan-SE 
1.1 program (Lowe and Eddy (1997) Nucl. Acids Res. 25: 955-964). Phylogenetic 
relationships were determined by the neighbor-joining distance algorithm using 
Phylip (Felsenstein (1993) PHYLIP (Phylogeny Inference Package). Department of 
Genetics, University of Washington, Seattle; Saitou and Nei (1987) Mol. Biol. 
Evol. 4: 406-425) and were based on reverse transcriptase amino acid sequences that 
had been aligned with ClustalW 1.7. Thompson et al. (1994) Nucl. Acids Res. 22: 
4673-4680. Transmembrane helices were identified using the PHDhtm program. 
Rost et al. (1995) Prot. Science 4: 521-533. All DNA sequences have been 
submitted to the DDBJ/EMBL/GenBank databases under the accession numbers 
X12345, X23456, X34567 and X45678. 

RESULTS 

Tat1 is a retrotransposon: Tat1 insertions share features with retrotransposon 
solo LTRs. We reasoned that if Tat1 is a retrotransposon, then there should be 
full-length elements in the genome consisting of two Tat1 sequences flanking an internal 
retrotransposon coding region. To test this hypothesis, additional Tat1 elements 
were isolated by screening a Landsberg (La-0) genomic DNA library with a Tat1 
probe. Twenty-one λ phage clones were isolated and Southern analysis revealed two 
clones (pDW42 and pDW99) each with two copies of Tat1 (data not shown). The 
two Tat1 elements in each clone were sequenced, along with the intervening DNA. 
All Tat1 sequences shared >89% nucleotide identity to the previously characterized 
Tat1a - Tat1c elements. Peleman et al. (1991) Proc. Natl. Acad. Sci. USA 88: 
3618-3622. In clone pDW99, the 5' and 3' Tat1 sequences were 433 bases in length and 
differed at only two base positions. These Tat1 sequences also had conserved 
features of LTRs, including the dinucleotide end-sequences (5' TG-CA 3') that were 
part of 12 base inverted terminal repeats. If the two Tat1 elements in clone pDW99 
were retrotransposon LTRs, then both, along with the intervening DNA, should be 
flanked by a target site duplication. A putative five base target site duplication 
(TATGT) was present immediately adjacent to the 5' and 3' Tat1 elements, 
supporting the hypothesis that they and the intervening DNA inserted as a single 
unit. In clone pDW42, the 5' Tat1 was 432 bases in length and shared 98% 
nucleotide sequence identity to the 3' Tat1. The 3' Tat1 lost its last ~74 bases 
during library construction and lies adjacent to one phage arm. A target 
site duplication, therefore, could not be identified in this clone. 
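
The three structural tests applied above (TG...CA dinucleotide ends, terminal 
inverted repeats and a flanking target site duplication) can be sketched in Python. 
This is an illustrative sketch only: the function names are hypothetical, the toy 
LTR and flanking DNA are invented for the example and are not Tat1 sequence, and 
only perfect matches are counted, whereas real inverted repeats may be imperfect. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_tg_ca_ends(ltr: str) -> bool:
    """True if the LTR starts with TG and ends with CA."""
    ltr = ltr.upper()
    return ltr.startswith("TG") and ltr.endswith("CA")


def inverted_repeat_length(ltr: str, max_len: int = 12) -> int:
    """Number of leading bases that perfectly match the reverse complement
    of the trailing bases, checked up to max_len."""
    ltr = ltr.upper()
    rc = reverse_complement(ltr)  # rc[:n] is the reverse complement of the last n bases
    n = 0
    while n < min(max_len, len(ltr) // 2) and ltr[n] == rc[n]:
        n += 1
    return n


def has_target_site_duplication(seq: str, start: int, end: int, tsd_len: int = 5) -> bool:
    """True if the tsd_len bases before start equal the tsd_len bases after end.
    start and end are 0-based, end-exclusive coordinates of the element."""
    if start - tsd_len < 0 or end + tsd_len > len(seq):
        return False
    return seq[start - tsd_len:start].upper() == seq[end:end + tsd_len].upper()


# Toy data (invented): a 34-base "LTR" with 12-base inverted terminal repeats.
terminal = "TGTTAGGATCCA"
toy_ltr = terminal + "ACGTACGTAC" + reverse_complement(terminal)
toy_element = toy_ltr + "AAAAAAAA" + toy_ltr          # LTR - internal region - LTR
toy_locus = "GGC" + "TATGT" + toy_element + "TATGT" + "CCG"
element_start = 8
element_end = element_start + len(toy_element)

print(has_tg_ca_ends(toy_ltr))
print(inverted_repeat_length(toy_ltr))
print(has_target_site_duplication(toy_locus, element_start, element_end))
```

The five base duplication TATGT used in the toy locus is the one reported for 
clone pDW99; everything else in the example is a placeholder. 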

DNA sequences were analyzed for potential coding information between the 
5' and 3' Tat1 elements. Nearly identical ORFs of 424 and 405 amino acids were 
found encoded between the Tat1 sequences in pDW42 and pDW99, respectively. 
The derived amino acid sequences of these ORFs were used to search the DNA 
sequence database with the BLAST search tool, and significant similarity was found 
to the Zea mays retrotransposable element Zeon-1 (p = 4.4e-08). Hu et al. (1995) 
Mol. Gen. Genet. 248: 471-480. The ORFs have ~44% similarity across their 
entirety to the 628 amino acid ORF encoded by Zeon-1 (see below). The Zeon-1 
ORF includes a zinc finger motif characteristic of retrotransposon gag protein RNA 
binding domains. Hu et al. (1995) Mol. Gen. Genet. 248: 471-480. Although the 
Tat1 ORFs do not include the zinc finger motif, the degree of similarity suggests that 
they are part of a related gag protein. 
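
A scan for open reading frames of the kind described above can be sketched as 
follows. This is a simplified illustration, not the analysis actually performed 
(which used the software named under "Sequence analysis"): the function name is 
hypothetical, only the three forward frames are scanned, an ORF is assumed to run 
from an ATG to the first in-frame stop codon, and the input is a toy sequence. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(seq: str) -> tuple[int, int, int]:
    """Return (start, end, n_codons) for the longest ATG-to-stop ORF in the
    three forward frames. start is 0-based, end is exclusive and includes the
    stop codon, and n_codons counts coding codons without the stop.
    Returns (0, 0, 0) if no complete ORF is found."""
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == "ATG":
                orf_start = i                      # open a new ORF
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i - orf_start) // 3    # close the ORF at the stop
                if n_codons > best[2]:
                    best = (orf_start, i + 3, n_codons)
                orf_start = None
    return best


# Toy sequence (invented): ATG + five GCT codons + TAA, offset into frame 2.
toy_sequence = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(longest_forward_orf(toy_sequence))   # (2, 23, 6)
```

A fuller treatment would also scan the reverse complement and translate the 
codons before searching a protein database. 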

If the Tat1 sequences in pDW42 and pDW99 defined retrotransposon 
insertions, a PBS would be predicted to lie adjacent to the 5' Tat1 elements in both 
clones. The putative Tat1 PBS shares similarity with PBSs of Zeon-1 and another 
maize retrotransposon called Cinful (see below), but it is not complementary to an 
initiator methionine tRNA as is the case for most plant retrotransposons. 
Additionally, a possible polypurine tract (PPT), the primer for second strand cDNA 
synthesis, was observed one base upstream of the 3' Tat1 sequence in both phage 
clones (5'-GAGGACTTGGGGGGCAAA-3'). We concluded from the available 
evidence that Tat1 is a retrotransposon, and we have designated the 3960 base 
insertion in pDW42 as Tat1-1 and the 3879 base insertion in pDW99 as Tat1-2. It 
is apparent that both Tat1-1 and Tat1-2 are non-functional. Their ORFs are truncated 
with respect to the coding information found in transposition-competent 
retrotransposons, and they lack obvious pol motifs. 

In light of our findings, the previously reported Tat1 sequences can be 
reinterpreted. Tat1a and Tat1b, which are flanked by putative target site 
duplications, are solo LTRs. Tat1c, the only element without a target site duplication, 
is actually the 5' LTR and part of the coding sequence for a larger Tat1 element. 

Copy number of Tat1 among A. thaliana ecotypes: To estimate Tat1 copy 
number, the 5' LTR, gag and the 3' non-coding region were used as separate probes 
in Southern hybridizations. The Southern filters contained genomic DNA from 17 
ecotypes representing wild populations of A. thaliana from around the world. This 
collection of ecotypes had previously been used to evaluate retrotransposon 
population dynamics. Konieczny et al. (1991) Genetics 127: 801-809; Voytas et al. 
(1990) Genetics 126: 713-721; Wright et al. (1996) Genetics 142: 569-578. Based 
on the hybridization with the gag probe, element copy number ranges from two to 
approximately ten copies per ecotype. The copy number of the LTRs is higher, 
likely due to the presence of two LTRs flanking full-length elements or solo LTRs 
scattered throughout the genome. The Tat1 copy number contrasts with the copy 
numbers (typically less than three per ecotype) observed for 28 other A. thaliana 
retrotransposon families. Konieczny et al. (1991) Genetics 127: 801-809; Voytas et 
al. (1990) Genetics 126: 713-721; Wright et al. (1996) Genetics 142: 569-578. In 
addition, the Tat1-hybridizing restriction fragments are highly polymorphic among 
strains. This degree of polymorphism, coupled with the high copy number, 
suggested that Tat1 has been active in transposition since the separation of the 
ecotypes. 

The Tat1 3' non-coding region contains DNA sequences from elsewhere in 
the genome: In an attempt to identify a complete and functional Tat1 element, 
LTR-specific primers were used in PCR reactions optimized for amplification of large 
DNA fragments. Most full-length retrotransposable elements are between five and 
six kb in length. DNAs from all 17 ecotypes were used as templates, and each gave 
amplification products of ~3.2 kb, the size predicted for Tat1-1 and Tat1-2 (data not 
shown). In La-0, however, a 3.8 kb PCR product was also recovered. This PCR 
product was cloned, sequenced and called Tat1-3. This insertion is expected to be 
about 4.6 kb in total length if the LTR sequences are included. 

Tat1-3 differed from Tat1-1 and Tat1-2 in that it had two ORFs separated by 
stop codons and a 477 base insertion in the 3' non-coding region. The first ORF 
(365 amino acids) was similar to but shorter than the ORFs of the other Tat1 
elements. The sequences constituting the second ORF (188 amino acids) were not 
present in the other Tat1 insertions and were not related to other sequences in the 
DNA databases. Database searches with the 477 base insertion in the 3' non-coding 
region, however, revealed three regions of similarity to other genomic sequences. 
A region of 113 bases matched a region of 26 bp repeats in the 5' untranslated 
sequence of the AT-P5C1 mRNA, which encodes pyrroline-5-carboxylate reductase 
(p = 2.1e-19). Verbruggen et al. (1993) Plant Physiol. 103: 771-781. In addition, 50 
bases appear to be a remnant of another retrotransposon related to Tat1. These 50 
bases are 71% identical to the 3' end of the Tat1-3 LTR and the putative primer 
binding site. The putative primer binding site, however, is more closely related to 
those of other plant retrotransposons such as Huck-2 (SanMiguel et al. (1996) 
Science 274: 765-768). Finally, sequences in the remainder of the insertion showed 
significant similarity to a region on chromosome 5. To confirm that Tat1-3 was not 
a PCR artifact, two additional primer pairs were used in separate amplifications. 
Both amplifications gave PCR products of the predicted sizes, which were cloned 
and confirmed to be Tat1-3 by DNA sequencing. 

PCR amplifications with the additional primer pairs also yielded a product 
0.8 kb longer than that expected for Tat1-3. This product was cloned, sequenced and 
found to be another Tat1 element, designated Tat1-4. This element has sequences 
similar to a Tat1 LTR, polypurine tract and the second ORF of Tat1-3. In Tat1-4, 
1182 bases of DNA are found in the 3' non-coding region at the position 
corresponding to the 477 base insertion in Tat1-3. This region does not match any 
sequences in the DNA databases. 

Other Tat1-like elements in A. thaliana: A BLAST search of DNA sequences 
generated by the A. thaliana genome project identified two more solo LTRs similar 
to Tat1. All share similarities throughout, but most strikingly, they are very well 
conserved at the 5' and 3' ends where it is expected integrase would bind. 
Braiterman and Boeke (1994) Mol. Cell. Biol. 14: 5731-5740. These conserved 
end-sequences suggest that the integrases encoded by full-length elements are also 
related, and that the LTRs have evolved under functional constraints; that is, they are 
not simply degenerate Tat1 LTRs. The two new LTRs are designated as Tat2-1 and 
Tat3-1. Tat2-1 is 418 bases long, is flanked by a five base target site duplication 
(CTATT) and is ~63% identical to the Tat1-2 5' LTR. Tat3-1 is 463 bases long and 
is also flanked by a target site duplication (ATATT). Tat3-1 is ~53% identical to the 
Tat1-2 5' LTR. 

Tat1 and Athila are related to Ty3/gypsy retrotransposons: Further analysis 
of data from the A. thaliana genome project revealed two slightly degenerate 
retrotransposons with similarity to the Tat1 ORF. These elements were identified 
within the sequence of the P1 phage clones MXA21 (Accession AB005247; bases 
54,977-66,874) and MXI10 (Accession AB005248; bases 24,125-35,848). Each has 
two LTRs, a putative PBS, and long ORFs between their LTRs. The genetic 
organization of these elements is depicted in Figures 5A and 6A. Amino acid 
sequence analysis indicated the presence of an RNA binding domain that defines gag 
in both elements. This region is followed by conserved reverse transcriptase, 
RNaseH, and integrase amino acid sequence domains characteristic of pol (data not 
shown). Classification of eukaryotic retrotransposons into the Ty1/copia elements 
(Pseudoviridae) and Ty3/gypsy elements (Metaviridae) is based on pol gene 
structure. Boeke et al. (1998) Metaviridae. In Virus Taxonomy: ICTV VIIth Report, 
edited by F. A. Murphy. Springer-Verlag, New York; Boeke et al. (1998b) 
Pseudoviridae. In Virus Taxonomy: ICTV VIIth Report, edited by F. A. Murphy. 
Springer-Verlag, New York. The domain order of the pol genes (reverse 
transcriptase precedes integrase) and similarities among their encoded reverse 
transcriptases (see below) identify these elements as the first full-length A. thaliana 
Ty3/gypsy elements. 

Because the characterized Tat1 insertions do not encode pol genes, this 
element family could not be classified. However, the amino acid sequence of the 
Tat1-2 ORF is 51% similar to the gag region of the MXA21 retrotransposon. Since 
plant retrotransposons within the Ty1/copia or Ty3/gypsy families, even those with 
highly similar pol genes, share little amino acid sequence similarity in their gag 
regions, Tat1 is likely a Ty3/gypsy element. This conclusion is further supported by 
the report that the Tat-like Zeon-1 retrotransposon is very similar to a Z. mays 
Ty3/gypsy element called cinful (Bennetzen (1996) Trends Microbiol. 4: 347-353); 
however, only the 5' LTR and putative primer binding site (PBS) sequences are 
available in the sequence database for analysis (Accession U68402). Because of the 
extent of similarity to Tat1, we have named the MXA21 insertion Tat4-1. 

The gag region of the MXI10 element is 62% similar (p = 1.1e-193) to the 
first ORF of Athila, which has previously been unclassified (Pelissier et al. (1995) 
Plant Mol. Biol. 29: 441-452). This implies that Athila is also a Ty3/gypsy element, 
and we have designated the MXI10 insertion as Athila1-1. Our classification of 
Athila as a Ty3/gypsy element is further supported by the observation that the Athila 
gag amino acid sequence shares significant similarity to the gag protein encoded by 
the cyclops-2 Ty3/gypsy retrotransposon of pea (Accession AJ000640; p = 1.1e-46; 
data not shown). Further analysis of the available A. thaliana genome sequences 
identified three additional Athila homologs. They include an additional Athila1 
element, designated Athila1-2, and two more distantly related Athila-like elements, 
designated Athila2-1 and Athila3-1. 

In addition to similarities among their gag amino acid sequences, the Tat 
elements have short LTRs (<550 bp) and long 3' non-coding regions (>2 kb). In 
contrast, the Athila-like elements have long LTRs (>1.2 kb) and are very large 
retrotransposons (>11 kb). One additional feature to note about both the Athila-like 
and Tat-like elements is the high degree of sequence degeneracy of their internal 
coding regions. This contrasts with the near sequence identity of their 5' and 3' 
LTRs, which is typically greater than 95%. Because a single template is used in the 
synthesis of both LTRs, LTR sequences are usually identical at the time of 
integration. The degree of sequence similarity between the LTRs suggests that most 
elements integrated relatively recently. The polymorphisms observed in the internal 
domains of these insertions, therefore, may have been present in their progenitors, 
and these elements may have been replicated in trans. 
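
The LTR comparison above rests on a pairwise identity figure for the 5' and 3' LTRs of one element. A minimal Python sketch of such a calculation follows. It assumes the two LTRs have already been aligned to equal length, and it adopts one common convention (columns containing a gap are skipped), which is not necessarily the convention used for the figures reported here. The function name and the toy sequences in the usage lines are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences, not taken from any element):
five_prime_ltr = "TGTTAGGACCTTAGCA"
three_prime_ltr = "TGTTAGGACCTTGGCA"
print(percent_identity(five_prime_ltr, three_prime_ltr))  # 15 of 16 positions match
```

Under this convention, an element whose two LTRs differ at only a few positions per hundred would fall in the ">95%" class described above.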

A novel, conserved coding region in Athila elements: A surprising feature 
of Athila1-1 is the presence of an additional ORF after integrase. Like gag, this ORF 
shares significant similarity across its entirety (p = 3.8e-08) to the second ORF of 
Athila. This ORF is also encoded by the Athila2-1 and Athila3-1 elements, although 
it is somewhat more degenerate. The presence of this coding sequence among these 
divergent retrotransposons suggests that it plays a functional role in the element 
replication cycle. However, the ORF shows no similarity to retrotransposon gag or 
pol genes. The retroviruses and some Ty3/gypsy retrotransposons encode an env 
gene after integrase. Although not well-conserved in primary sequence, both viral 
and retrotransposon envelope proteins share some structural similarities. They are 
typically translated from spliced mRNAs and the primary translation product 
encodes a signal peptide and a transmembrane domain near the C-terminus. All four 
families of Athila elements encode a domain near the center of the ORF that is 
strongly predicted to be a transmembrane region (70% - 90% confidence, depending 
on the element analyzed) (Rost et al. (1995) Prot. Science 4: 521-533). Two 
retrotransposons, Athila and Athila2-1, also have a hydrophobic transmembrane 
domain near the 5' end of their env-like ORFs, which may serve as a secretory signal 
sequence. Von Heijne (1986) Nucl. Acids Res. 14: 4683-4690. 

Two lineages of plant Ty3/gypsy retrotransposons: Relationships among 
Ty3/gypsy retrotransposons from A. thaliana and other organisms were assessed by 
constructing a neighbor-joining tree of their reverse transcriptase amino acid 
sequences. Included in the analysis were reverse transcriptases from two additional 
families of A. thaliana Ty3/gypsy elements that we identified from the unannotated 
genome sequence data (designated Tma elements; Tma1-1 and Tma3-1); two other 
Tma element families were identified in the genome sequence that did not encode 
complete reverse transcriptases (Tma2-1 and Tma4-1; Table 1). Also included in the 
phylogenetic analyses were reverse transcriptases from a faba bean retrotransposon 
and the cyclops-2 element from pea. The plant Ty3/gypsy group retrotransposons 
resolved into two lineages: One was made up of del1 from lily, the IFG7 
retrotransposon from pine, reina from Z. mays, and Tma1-1 and Tma3-1. This group 
of elements formed a single branch closely related to numerous fungal 
retrotransposons (branch 1). The second branch (branch 2) was well-separated from 
all other known Ty3/gypsy group elements, and was further resolved into two 
lineages: Athila1-1, cyclops-2 and the faba bean reverse transcriptase formed one 
lineage (the Athila branch), and Tat4-1 and Grande1-4 from Zea diploperennis 
formed a separate, distinct branch (the Tat branch). 

Primer binding sites: Most plant Ty1/copia retrotransposons as well as the 
branch 1 Ty3/gypsy elements have PBSs complementary to the 3'-end of an initiator 
methionine tRNA. This is not the case for any of the branch 2 Ty3/gypsy elements. 
We compared the putative PBSs of Tat-branch and Athila-branch elements to known 
plant tRNA genes as well as to the 11 tRNA genes that had been identified to date 
in sequences generated by the A. thaliana genome project. In addition, we searched 
the unannotated A. thaliana genome sequences and identified 30 more A. thaliana 
tRNA genes using the program tRNAscan-SE (Lowe and Eddy (1997) Nucl. Acids 
Res. 25: 955-964). The PBS of Tat1 is complementary to 10 bases at the 3' end of 
the asparagine tRNA for the AAC codon; these 10 bases are followed by a two base 
mismatch and six additional bases of perfect complementarity. The Tat4-1 PBS is 
complementary to 20 bases at the 3' end of the arginine tRNA for the AGG codon 
with one mismatch 10 bases from the 3' end; Huck-2, Grande-zm1, Grande1-4, and 
the retrotransposon-like insertion in the 3' non-coding region of Tat1-3 all have 
20-base perfect complementarity to this tRNA. The PBS of Athila1-1 is perfectly 
complementary to 15 bases at the 3' end of the aspartic acid tRNA for the GAC 
codon, and Athila and Athila2-1 have 13 bases of complementarity to this tRNA. At 
this time there is no known plant tRNA complementary to the PBS of Zeon-1, which 
has the same PBS as the maize retrotransposon cinful. As more tRNA sequences 
become available, a candidate primer may be identified for these elements. 
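
The PBS comparisons above amount to asking how many bases, counted from the 3' end of a tRNA, can pair with the element's putative PBS. The following Python sketch illustrates that count. It assumes both sequences are written 5' to 3' as DNA, so the first PBS base pairs with the last tRNA base. The function names, the mismatch handling, and the toy sequences are hypothetical and are not taken from any element or tRNA gene discussed here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pbs_match_length(pbs: str, trna: str, max_mismatches: int = 0) -> int:
    """Length of the PBS/tRNA pairing measured from the tRNA 3' end.

    Scanning stops once more than `max_mismatches` mismatches are seen;
    the returned length always ends on a paired base.
    """
    target = reverse_complement(trna)  # tRNA 3' end, written in PBS orientation
    last_paired = 0
    mismatches = 0
    for position, (p, t) in enumerate(zip(pbs.upper(), target), start=1):
        if p == t:
            last_paired = position
        else:
            mismatches += 1
            if mismatches > max_mismatches:
                break
    return last_paired


# Toy tRNA 3' region ending in CCA (hypothetical sequence):
toy_trna = "GATTACAGATTACACCA"
print(pbs_match_length("TGGTGTAATCAAAA", toy_trna))                      # perfect run only
print(pbs_match_length("TGGTGTAATCAGTAAT", toy_trna, max_mismatches=1))  # one mismatch tolerated
```

Allowing one mismatch mirrors the kind of statement made above for Tat4-1, where a 20-base match contains a single mismatch.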

Example 2 Characterizing the Pisum sativum Retroelement 
("Cyclops" element) env gene 

After identifying the retrovirus-like elements in A. thaliana, the element 
called Cyclops2 from Pisum sativum (Chavanne et al. (1998) Plant Mol. Biol. 
37:363-375) was examined. Comparison of this element to the Athila-like elements 
both in size and amino acid and nucleotide sequence composition was made. 
Cyclops2 also encodes an open reading frame (ORF) in the position corresponding 
to the env-like gene of the Athila elements. This Cyclops2 ORF was examined using 
the same methods used to characterize the Athila group env-like genes (see Example 
1). The Cyclops2 ORF was found to have a potential splice site at its N-terminus 
and transmembrane domains at the N-terminus, the central region and the 
C-terminus. Based on the presence of these features, it was concluded that Cyclops2 
is a retrovirus-like retroelement that encodes an env-like gene. 

Example 3 Obtaining the Soybean Retroelements 
("Calypso" elements) 

Materials and Methods 

Library Screening and Southern Hybridization. A soybean genomic lambda 
phage library (line L85-3044) was initially screened with a reverse transcriptase 
probe under low stringency conditions (50 degrees Celsius with a 1% SDS wash) 
(Church and Gilbert (1984) Proc. Natl. Acad. Sci. USA 81:1991-1995). The library 
was previously described (Chen et al. (1998) Soybean Genetics Newsletter 25:132- 
134). The probe was obtained by PCR amplification of genomic P. sativum DNA 
using primers based on the reverse transcriptase of Cyclops2 (DVO701 and 
DVO702). All probes were radio-labeled using random primers and protocols 
supplied by Promega (Madison, WI). For Southern hybridizations, DNA was 
digested, electrophoresed on 0.8% agarose gels, and transferred to Gene Screen Plus 
membranes using the manufacturer's alkaline transfer protocol (New England 
Nuclear, Boston, MA). All high stringency hybridizations were as described 
(Church and Gilbert (1984) Proc. Natl. Acad. Sci. USA 81:1991-1995). 

DNA sequencing. Lambda phage clones were subcloned into the vector 
pBluescript II KS- and transformed into the E. coli host strain XL1 Blue (Stratagene, 
La Jolla, CA) (Ausubel et al., Current Protocols in Molecular Biology (Greene 
Publishing Associates, Inc., 1993)). Subclones were sequenced by primer walking 
at the Iowa State University DNA sequencing facility. 

Sequence Analysis. DNA sequence analysis was performed using the GCG 
software package (Devereux et al. (1984) Nucleic Acids Res. 12:387-395), DNA 
Strider 1.2 (Marck (1991) DNA Strider 1.2, Gif-sur-Yvette, France) and the BLAST 
search tool (Altschul et al. (1990) J. Mol. Biol. 215: 403-410). Phylogenetic 
relationships were determined by the neighbor-joining distance algorithm (Saitou 
and Nei (1987) Mol. Biol. Evol. 4: 406-425) using PAUP v4.0 beta 1 (Swofford 
(1993) Illinois Natural History Survey, Champaign, IL) and were based on reverse 
transcriptase amino acid sequences that had been aligned with ClustalX v1.63b 
(Thompson et al. (1994) Nucl. Acids Res. 22: 4673-4680). Transmembrane helices 
were identified using the PHDhtm program and TMPred (Rost et al. (1995) Prot. 
Science 4: 521-533; Hofmann and Stoffel (1993) Biol. Chem. 374:166). 

Results 

Retrovirus-like elements in Glycine max. Soybean retrovirus-like elements 
were identified by a low stringency (50 degrees C) screen of a soybean lambda 
library using a reverse transcriptase probe. The probe was based on a sequence from 
Cyclops2 (Chavanne et al. (1998) Plant Mol. Biol. 37:363-375). The screen 
produced 63 lambda clones that appeared to contain a retrovirus-like reverse 
transcriptase based on hybridization to the probe. Thirty-five of these putative 
elements were sequenced to varying degrees and 24 encoded readily identifiable 
retrovirus-like sequences. Most of the elements were distantly related and had 
premature stop codons, frame shifts, deletions or insertions. A related group of three 
elements and another related pair were completely sequenced and analyzed. The 
three elements in the first group are referred to as Calypso1-1, Calypso1-2, and 
Calypso1-3. The elements in the second pair are referred to as Calypso2-1 and 
Calypso2-2. The remaining soybean retrovirus-like elements will be given the 
Calypso name and a sequential designator number based on their family grouping. 

The Calypso retrovirus-like elements have the same overall structure and 
sequence homology as the previously described Athila and Cyclops elements. The 
elements are ~12 kb in length; they have a 5' LTR, a PBS (Primer Binding Site), a 
gag protein, a pol protein, a spacer, an env-like protein, another spacer region, a PPT 
(Polypurine Tract) and a 3' LTR. The LTRs vary from ~1.3 to ~1.5 kb in length and 
characteristically begin with TG and end with CA. The PBS is similar to that used 
by the Athila and Cyclops elements; it is 4 to 6 bases past the 5' LTR and matches 
the 3' end of a soybean aspartic acid tRNA for 18 to 19 bases with 1 mismatch. The 
fact that the sequences of the Calypso primer binding sites are shared with the A. 
thaliana and P. sativum retrovirus-like elements indicates that this sequence is a 
unique marker for envelope-encoding retroelements. The gag protein extends ~850 
amino acids and encodes a zinc finger domain (characterized by the amino acid motif 
CxxCxxxHxxxxC) and a protease domain (characterized by the amino acid motif 
LIDLGA). These domains are located at approximately the same positions within 
gag as in other retroelements. The ~600 amino acid reverse transcriptase region 
follows gag and has the conserved plant retrovirus-like motifs which approximate 
the following amino acids: KTAF, MP/SFGLCNA, V/I/MEWMDDFS/WV/I, 
FELMCDASDYAI/VGAVLGQR, and 
YATT/IEKEL/ML AIVF/Y AL/TEKJFR/KS YL WGSR/KV, respectively. The ~450 
amino acid integrase domain has the plant retrovirus-like integrase motifs that 
approximate HCHxSxxGGH30xCDxCQR for the Zn finger as well as two other 
motifs that approximate WGIDFI/V/MGP, and PYHPQTxGQA/VE. After 
integrase, there is a ~0.7 kb spacer then a ~450 amino acid env-like protein coding 
region. The env-like protein of the Calypso elements is well conserved through most 
of the ORF but conservation decreases toward the C-terminus. The conservation 
includes 2 or 3 presumed transmembrane domains and a putative RNA splice site 
acceptor. The env-like protein is followed by a ~2 kb spacer then a polypurine tract 
with the approximate sequence ATTTGGGGG/AANNT. The 3' LTR starts 
immediately after the final T of the PPT. 
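
Several of the landmarks just listed are simple sequence patterns, so they can be located with regular expressions. The Python sketch below is illustrative only. It writes the zinc finger motif CxxCxxxHxxxxC and the protease motif LIDLGA as protein patterns, and it reads the approximate PPT sequence "ATTTGGGGG/AANNT" as ATTTGGGG, then G or A, then A, two arbitrary bases, and T. That reading of the slash is an assumption. All function names and the test strings in the usage lines are hypothetical.

```python
import re

ZINC_FINGER = re.compile(r"C..C...H....C")       # CxxCxxxHxxxxC (protein)
PROTEASE = re.compile(r"LIDLGA")                 # protease motif (protein)
PPT = re.compile(r"ATTTGGGG[GA]A[ACGT]{2}T")     # assumed reading of the PPT (DNA)


def find_motif(pattern: re.Pattern, seq: str) -> list[int]:
    """0-based start positions of non-overlapping matches in `seq`."""
    return [m.start() for m in pattern.finditer(seq.upper())]


def ltr_has_tg_ca_ends(ltr: str) -> bool:
    """True if an LTR begins with TG and ends with CA."""
    s = ltr.upper()
    return len(s) >= 4 and s.startswith("TG") and s.endswith("CA")


# Toy inputs (hypothetical, for illustration only):
toy_protein = "MKCAACRRRHQQQQCLIDLGA"
toy_dna = "ccATTTGGGGGAACTtt"
print(find_motif(ZINC_FINGER, toy_protein), find_motif(PROTEASE, toy_protein))
print(find_motif(PPT, toy_dna), ltr_has_tg_ca_ends("TGTTAGCA"))
```

Exact patterns like these are strict. Because the motifs above are described as approximate, a real scan would likely need to tolerate substitutions, for example by scoring against an alignment rather than matching literally.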

Calypso elements are abundant and heterogeneous. The Calypso elements 
appear to be abundant in the soybean genome. High stringency Southern blots of 
soybean DNA probed with reverse transcriptase, gag or env-like sequences produced 
smeared hybridization patterns, suggesting that the elements are abundant and 
heterogeneous. Their heterogeneity was also supported by DNA sequence analysis, 
which revealed a maximum of 93% nucleotide identity among elements, and most 
elements averaged ~88% nucleotide identity. This identity can be region-specific 
or dispersed over the element's entirety. For example, reverse transcriptase, 
integrase and envelope-like coding regions may be well conserved, whereas the 
LTR, gag and spacer regions may have very little sequence conservation. 

Phylogenetic analysis of Calypso reverse transcriptase. The reverse 
transcriptase of retroelements is the preferred protein for assessment of phylogenetic 
relationships (Xiong and Eickbush (1990) EMBO J. 9:3353-3362). This is due to the 
high degree of amino acid sequence conservation found in reverse transcriptase 
proteins from many sources. The Calypso retrovirus-like elements were compared 
to previously described Ty3/gypsy and retrovirus-like elements from plants, fungi 
and invertebrate animals. The Calypso elements formed a distinct group with other 
plant retrovirus-like elements from A. thaliana, P. sativum and faba bean. This 
group did not include plant Ty3/gypsy elements that are members of the metavirus 
genus. This indicates that the plant retrovirus-like elements from these four plant 
species are closely related and form a new element group that may be present in all 
or most plant species. 

The Calypso reverse transcriptase and integrase are well-conserved. Frame 
shifts in the retrovirus-like elements were repaired through sequence comparison 
between the retrovirus-like elements from A. thaliana, P. sativum and G. max. 
Restoration typically involved an insertion or deletion of a single nucleotide or a 
single nucleotide substitution. When the edited ORFs of seven plant retrovirus-like 
elements from three species were compared, it was found that the gag domain had 
very little conservation. The amino acid sequence around the protease domain was 
reasonably conserved (~50%) but the reverse transcriptase and integrase domains 
were highly conserved (~70%). 

The env-like ORF of Calypso is well-conserved. Animal retrovirus env 
proteins share little in common. They are, however, cleaved into two functional units 
that consist of the surface (SU) and transmembrane (TM) peptides. The SU peptide 
contains a transmembrane secretory signal at the N-terminus. The TM peptide has 
two transmembrane domains, one at the N-terminus, which functions in membrane 
fusion, and another near the C-terminus, which acts as an anchor site. The retrovirus 
env protein is expressed from an RNA that is spliced near the beginning of the env 
ORF. There are currently nine Athila group elements from A. thaliana that have an 
identifiable env-like ORF. Alignment of the env-like amino acid sequence shows 
that there are five subgroups of env-like proteins in the Athila family. Three are 
distinct, four are closely related and another pair is closely related. As a whole, these 
env-like sequences share limited homology over the entire length of the ORF, but 
within subgroups, they share high homology (data not shown). Some of the Athila 
env-like proteins have an apparent secretory peptide and a central transmembrane 
domain, suggesting that they may have an env-like function. 

Among the Calypso elements, seven have been characterized that encode 
env-like ORFs. These env-like ORFs form four families that have a high degree of 
overall sequence similarity beginning at the first methionine and continuing for three 
quarters of the ORF; sequence similarity falls off dramatically near the C-terminus. 
The amino acid sequence at the first methionine has the consensus sequence 
QMASR/KKRR/KA, which appears to be a nuclear targeting signal; however, the 
program PSORT predicts only a 0.300 confidence level for this targeting role (Nakai 
and Horton (1999) Trends Biochem. Sci. 24:34-36). A similar sequence (ASKKRK) 
is found at the same position in the env-like ORF of Cyclops2, suggesting that it 
serves a similar purpose. No other potential targeting peptide stands out from the 
sequence that has been analyzed so far. There is a conserved region that is predicted 
to be a transmembrane domain near the center of the Calypso env-like protein and 
a second transmembrane domain located at variable positions near the C-terminus. 
These may be the fusion and anchor functions of a TM peptide. It should also be 
noted that five of the seven ORFs are predicted to have a transmembrane domain that 
is just before and includes the first methionine. This N-terminal transmembrane 
domain may be a secretory signal of an SU peptide. The program TMpred estimates 
these transmembrane domains to be significant based on a score >500 (Hofmann and 
Stoffel (1993) Biol. Chem. 374: 166). These three transmembrane domains are found 
in the Cyclops2 env-like protein at similar locations but at a reduced significance 
score. Another feature of the Calypso env-like ORF is the conserved splice site that 
is predicted to be at the first methionine by the program NetGene2 v. 2.4 with a 
confidence level of 1.00 (Hebsgaard et al. (1996) Nucleic Acids Res. 24:3439-3452; 
Brunak et al. (1991) J. Mol. Biol. 220:49-65). There are other less preferred putative 
splice sites in the region, but only the splice site near the methionine is optimally 
placed and conserved in all seven env-like ORFs. 

Example 4 Obtaining the Generic Plant Retroelements 
("Generic" elements) 

ClustalX v1.63b (Thompson et al. (1994) Nucl. Acids Res. 22: 4673-4680) was used 
to align nucleotide sequences of Calypso1-1, Calypso1-2 and Calypso1-3. A 
consensus sequence was generated from the ClustalX output. The consensus 
sequence file was then translated and compared using ClustalX to amino acid 
sequences of retrovirus-like elements from soybean, pea (Cyclops2) and A. thaliana 
(Athila-like elements) using the GCG computer software package (Devereux et al. 
(1984) Nucleic Acids Res. 12:387-395). For coding regions encompassing protease, 
reverse transcriptase and integrase, a new consensus sequence was generated that 
best matched the coding information in all elements. This second consensus 
sequence forms the protease, reverse transcriptase and integrase genes of the generic 
element. The gag gene of the generic element is a consensus sequence generated by 
editing alignments between Calypso1-1 and Calypso2-2. The env gene is a 
consensus sequence based on env gene sequence alignments of all Calypso elements. 
All non-coding regions for the generic element were obtained from Calypso1-2, 
with the exception of the LTRs, which were taken from Calypso1-1. 
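
The first step above, deriving a consensus from aligned element sequences, can be illustrated with a simple column-wise majority rule. The Python sketch below is one possible approach and is not presented as the procedure actually used: the text states only that a consensus was generated from the ClustalX output and then edited to best match the coding information. The function name, the tie-handling rule (an ambiguity symbol) and the toy alignment are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned: list[str], gap: str = "-", ambiguous: str = "N") -> str:
    """Column-wise majority consensus of equal-length aligned sequences.

    Ties are reported with `ambiguous`; columns whose majority
    character is a gap are dropped from the consensus.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    consensus = []
    for column in zip(*aligned):
        ranked = Counter(c.upper() for c in column).most_common()
        top_char, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        consensus.append(ambiguous if tied else top_char)
    return "".join(consensus).replace(gap, "")


# Toy three-sequence alignment (hypothetical, not Calypso sequence):
toy_alignment = ["ATGC-A", "ATGCTA", "ATCC-A"]
print(majority_consensus(toy_alignment))
```

A majority rule of this kind does not guarantee an open reading frame, which is consistent with the manual editing of coding regions described above.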

A generic retrovirus will be constructed by first generating a DNA sequence that 
approximates the sequence of the generic element. An element that closely matches 
the consensus, for example Calypso1-1, will be modified by PCR-based site-directed 
mutagenesis (Ausubel et al., Current Protocols in Molecular Biology 
(Greene Publishing Associates, Inc., 1993)). Modifications will be sequentially 
introduced into the starting element until it conforms to the sequence of the generic 
element. 

The generic element will be modified so that it will be expressed at high levels in 
plant cells. This will be accomplished by inserting an enhancer, such as the 
cauliflower mosaic virus 35S enhancer, into the 5' LTR. To monitor replication, 
a marker gene will be inserted into the virus between the end of the coding region for 
the env gene and the polypurine tract. The marker gene may encode resistance to an 
herbicide or antibiotic. The modified generic element will then be introduced into 
plant cells by standard means of plant transformation. Because the modified generic 
element will be expressed at high levels, retroviral particles will be produced by the 
host plant cell. These will be harvested and purified by passing cell lysates over 
sucrose density gradients. 

The plant retroviral particles will be incubated in the presence of non-transformed 
plant cells. The virus will associate with the plant cell and fuse with the plant cell 
membrane. The mRNA carried by the virus will be reverse transcribed and the 
resultant cDNA will be integrated into the genome of the plant. The integration of 
the viral DNA and the expression of the marker gene it carries will confer antibiotic 
resistance to the plant cell. Cells that carry integrated viruses can be identified 
through genetic selection. 

Example 5 Obtaining a Library of Reverse Transcriptase 
Sequences 

The degenerate oligos DVO1197 (5' GTG-CGN-AAR-GAR-GTN-NTN-AAR-YT 
3' for the N terminal amino acid sequence VRKEVLKL) and DVO1198 (5' 
AAC-YTT-NGW-RAA-RTC-YTT-DAT-RAA 3' for the C terminal amino acid 
sequence VKSFDKIF) were used to amplify the Xiong/Eickbush plant retrovirus 
reverse transcriptase domain from genomic DNA of a range of plants. New 
sequences were obtained from Nicotiana tabacum (Tobacco), Platanus 
occidentalis (Sycamore), Gossypium hirsutum (Cotton), Lycopersicon 
esculentum (Tomato), Solanum tuberosum (Potato), Oryza sativa (Rice), 
Triticum aestivum (Wheat), Hordeum vulgare (Barley), Sorghum bicolor 
(Sorghum), Avena sativa (Oat) and Secale cereale (Rye). No sequence was 
obtained for Pinus coulteri (Big-cone pine), Zea mays (Corn), Zea mays 
subspecies parviglumis (Teosinte), and a Tripsacum species. A positive 
control for PCR was used to obtain previously known sequences from 
Arabidopsis thaliana, Pisum sativum (pea) and three varieties (Hark 89, L85 
and Williams) of Glycine max (soybean). 
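
The two degenerate oligos above can be checked against their stated peptides with a short script. The Python sketch below expands IUPAC ambiguity codes, counts how many distinct sequences each oligo pool contains, and tests whether each codon position can encode the corresponding residue under the standard genetic code. The function and variable names are hypothetical. In this sketch the second oligo is treated as an antisense primer: its reverse complement is compared with FIKDFSKV, which is the stated peptide VKSFDKIF read in the opposite direction. That interpretation is inferred from the sequences themselves rather than stated in the text.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(primer: str) -> str:
    """Strip codon separators and spaces; upper-case."""
    return primer.replace("-", "").replace(" ", "").upper()


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in clean(primer):
        total *= len(IUPAC[base])
    return total


def reverse_complement(primer: str) -> str:
    return clean(primer).translate(IUPAC_COMPLEMENT)[::-1]


def codon_can_encode(degenerate_codon: str, residue: str) -> bool:
    """True if some codon for `residue` fits the (possibly partial) degenerate codon."""
    return any(
        aa == residue and all(b in IUPAC[d] for d, b in zip(degenerate_codon, codon))
        for codon, aa in CODON_TABLE.items()
    )


def primer_can_encode(primer: str, peptide: str) -> bool:
    s = clean(primer)
    codons = [s[i:i + 3] for i in range(0, len(s), 3)]
    return len(codons) == len(peptide) and all(
        codon_can_encode(c, aa) for c, aa in zip(codons, peptide)
    )


FORWARD = "GTG-CGN-AAR-GAR-GTN-NTN-AAR-YT"
REVERSE = "AAC-YTT-NGW-RAA-RTC-YTT-DAT-RAA"

print(degeneracy(FORWARD), degeneracy(REVERSE))                   # 4096 768
print(primer_can_encode(FORWARD, "VRKEVLKL"))                     # True
print(primer_can_encode(reverse_complement(REVERSE), "FIKDFSKV")) # True
```

The degeneracy figures give a rough sense of how dilute any single exact-match primer is within each 150 picomole pool used in the reactions.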

The conditions for PCR were as follows: 50 microliter reactions were set 
up with 5 microliters of Promega Taq enzyme buffer, 1 microliter of Taq 
enzyme, 5 microliters of Promega 25 millimolar magnesium chloride, 100 
nanograms genomic DNA, 5 microliters of 2.5 millimolar Promega dNTP 
(deoxynucleotide mixture) and 7.5 microliters of each oligo from a 20 
picomole/microliter solution. The reaction volume was brought to 50 
microliters with deionized water. PCR was done with denaturation at 92 
degrees Celsius for 2 minutes in the first cycle and 20 seconds in 
each cycle thereafter, annealing at 50 degrees Celsius for 30 
seconds and extension at 72 degrees Celsius for 1 minute 30 seconds. There 
was a total of thirty cycles. Based on known sequence data, a 762 base 
pair band was expected for each PCR reaction. 
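
The reaction recipe above can be turned into final concentrations with simple dilution arithmetic (stock concentration times added volume, divided by the 50 microliter total). The Python sketch below does this for the added magnesium chloride, the dNTP mixture and each oligo, and also totals the programmed hold times. The names are hypothetical. The figures ignore any magnesium already present in the enzyme buffer and do not settle whether "2.5 millimolar" refers to each nucleotide or to the mixture, since neither point is stated. The genomic DNA volume is also not given, so only the combined volume of water plus template is computed.

```python
REACTION_UL = 50.0


def final_concentration(stock: float, added_ul: float, total_ul: float = REACTION_UL) -> float:
    """Concentration after dilution, in the same units as `stock`."""
    return stock * added_ul / total_ul


# Volumes stated in the text (microliters).
components_ul = {
    "Taq buffer": 5.0,
    "Taq enzyme": 1.0,
    "MgCl2, 25 mM": 5.0,
    "dNTP mixture, 2.5 mM": 5.0,
    "oligo 1, 20 pmol/uL": 7.5,
    "oligo 2, 20 pmol/uL": 7.5,
}

mgcl2_mM = final_concentration(25.0, 5.0)    # added MgCl2 only
dntp_mM = final_concentration(2.5, 5.0)
oligo_pmol = 20.0 * 7.5                      # amount of each oligo per reaction
oligo_uM = final_concentration(20.0, 7.5)    # pmol/uL is numerically equal to uM
water_plus_template_ul = REACTION_UL - sum(components_ul.values())


def programmed_seconds(cycles: int = 30, first_denature: int = 120,
                       denature: int = 20, anneal: int = 30, extend: int = 90) -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    first_cycle = first_denature + anneal + extend
    later_cycles = (cycles - 1) * (denature + anneal + extend)
    return first_cycle + later_cycles


print(mgcl2_mM, dntp_mM, oligo_pmol, oligo_uM, water_plus_template_ul)
print(programmed_seconds() / 60.0, "minutes of programmed holds")
```

The comparatively large amount of each oligo is in keeping with their degeneracy, since only a fraction of each pool matches any given template exactly.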

The PCR reactions were run out on a 0.8% agarose gel; the band of 
approximately 762 base pairs was excised for each species and ligated to a 
T-vector, pBLUESCRIPT II KS-. The ligations were transformed into the 
E. coli strain XL1 BLUE, selected and sequenced. The results are in the Sequence 
Listing, at SEQ ID Nos 42 through 165, with the even numbered sequences in that 
range being the DNA sequences identified, and the odd-numbered sequences being 
the amino acid sequences deduced from the DNA sequences. 
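
The numbering scheme just described pairs each even-numbered DNA sequence with the odd-numbered amino acid sequence that follows it. A brief Python sketch of that bookkeeping is given below. It assumes, as the wording suggests but does not state outright, that SEQ ID NO n (even) and SEQ ID NO n+1 form a DNA/protein pair; the variable and function names are hypothetical.

```python
FIRST_SEQ_ID = 42
LAST_SEQ_ID = 165

# (DNA SEQ ID NO, deduced amino acid SEQ ID NO), assuming adjacent numbers pair up.
seq_id_pairs = [(n, n + 1) for n in range(FIRST_SEQ_ID, LAST_SEQ_ID, 2)]


def is_dna_seq_id(seq_id: int) -> bool:
    """True for the even-numbered (DNA) entries within the stated range."""
    return FIRST_SEQ_ID <= seq_id <= LAST_SEQ_ID and seq_id % 2 == 0


print(len(seq_id_pairs), seq_id_pairs[0], seq_id_pairs[-1])
```

Under this assumption the range holds 62 DNA sequences and 62 deduced amino acid sequences.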

Although the present invention has been fully described herein, it is to be noted that 
various changes and modifications are apparent to those skilled in the art. Such 
changes and modifications are to be understood as included within the scope of the 
present invention as defined by the appended claims. 

WHAT IS CLAIMED IS: 

1. An isolated nucleic acid molecule, wherein said nucleic acid 
molecule encodes at least a portion of a plant retroelement reverse 
transcriptase and comprises a nucleic acid sequence selected from the 
group consisting of: 

(a) a nucleic acid sequence having more than 85% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 85% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); a nucleic acid sequence of (b). 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); a nucleic acid sequence of (b). 

2. A seed comprising a nucleic acid of claim 1. 

3. A plant comprising a nucleic acid of claim 1. 

4. A nucleic acid molecule of claim 1, which comprises gag, pol and env 
genes and which comprises adenine-thymidine-guanidine as the gag 
gene start codon. 

5. A nucleic acid molecule of claim 2, which further comprises SEQ ID NO 5. 

6. An isolated nucleic acid molecule of claim 1, wherein said nucleic 
acid molecule encodes at least a portion of a plant envelope sequence 
and comprises a nucleic acid sequence selected from the group 
consisting of: 

(a) a nucleic acid sequence which has more than 90% identity to 
SEQ ID NO 5, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
which has greater than 85% identity to SEQ ID NO 6, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of SEQ 
ID NO 5; and 

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); a nucleic acid sequence of (b); and a nucleic acid 
sequence of (c). 

7. A plant cell comprising an isolated nucleic acid molecule of claim 6. 

8. A method to impart agronomically-significant characteristics to at 
least one plant cell, comprising: 

contacting a nucleic acid molecule of claim 6 to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

9. An isolated nucleic acid molecule, wherein said nucleic acid 
molecule encodes at least a portion of a plant retroelement reverse 
transcriptase and comprises a nucleic acid sequence selected from the 
group consisting of: 

(a) a nucleic acid sequence having more than 95% identity to a 
nucleic acid sequence selected from the group consisting of even- 
numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ ID 
NO 164, wherein said identity can be determined using the DNAsis 
computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
having more than 95% identity to an amino acid sequence selected 
from the group consisting of odd-numbered SEQ ID NOs inclusive 
from SEQ ID NO 43 through SEQ ID NO 165, wherein said identity 
can be determined using the DNAsis computer program and default 
parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a 
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b).

10. A seed comprising a nucleic acid of claim 9. 

11. A plant comprising a nucleic acid of claim 9.

12. A nucleic acid molecule of claim 9, which comprises gag, pol and env
genes and which comprises adenine-thymine-guanine (ATG) as the gag
gene start codon.

13. A nucleic acid molecule of claim 9, which further comprises SEQ ID NO 5. 

14. A method to impart agronomically-significant characteristics to at 
least one plant cell, comprising: 

contacting a nucleic acid molecule of claim 9 to at least one plant cell 
under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

15. An isolated nucleic acid molecule, wherein said nucleic acid 
molecule encodes at least a portion of a plant retroelement reverse 
transcriptase and comprises a nucleic acid sequence selected from the 
group consisting of: 

(a) a nucleic acid sequence selected from the group consisting of 
even-numbered SEQ ID NOs inclusive from SEQ ID NO 42 to SEQ 
ID NO 164, wherein said identity can be determined using the 
DNAsis computer program and default parameters; 

(b) a nucleic acid sequence which encodes an amino acid sequence 
selected from the group consisting of odd-numbered SEQ ID NOs 
inclusive from SEQ ID NO 43 through SEQ ID NO 165, wherein said 
identity can be determined using the DNAsis computer program and 
default parameters; 

(c) a nucleic acid sequence which encodes an allelic variant of a
nucleic acid sequence selected from the group consisting of: a nucleic 
acid sequence of (a); and a nucleic acid sequence of (b); and

(d) a nucleic acid sequence fully complementary to a nucleic acid 
sequence selected from the group consisting of: a nucleic acid 
sequence of (a); and a nucleic acid sequence of (b).

16. A seed comprising a nucleic acid of claim 15. 

17. A plant comprising a nucleic acid of claim 15. 

18. A nucleic acid molecule of claim 15, which comprises gag, pol and env
genes and which comprises adenine-thymine-guanine (ATG) as the gag
gene start codon.

19. A nucleic acid molecule of claim 9, which further comprises SEQ ID NO 5. 

20. A method to impart agronomically-significant characteristics to at 
least one plant cell, comprising: 

contacting a nucleic acid molecule of claim 15 to at least one plant 
cell under conditions sufficient to allow at least one agronomically- 
significant nucleic acid molecule to enter said cell. 

21. A nucleic acid molecule of claim 15, which further comprises at least
one nucleic acid sequence which encodes at least one agronomically- 
significant characteristic. 

22. A nucleic acid molecule of claim 21, wherein the agronomically- 
significant characteristic is selected from the group consisting of: 
male sterility; self-incompatibility; foreign organism resistance; 
improved biosynthetic pathways; environmental tolerance;
photosynthetic pathways; and nutrient content. 

23. A nucleic acid molecule of claim 21, wherein the agronomically 
significant characteristic is selected from the group consisting of: 
fruit ripening; oil biosynthesis; pigment biosynthesis; seed formation; 
starch metabolism; salt tolerance; cold/frost tolerance; drought 
tolerance; tolerance to anaerobic conditions; protein content; 
carbohydrate content (including sugars and starches); amino acid 
content; and fatty acid content. 

24. An isolated plant retroviral particle comprising a nucleic acid 
molecule of claim 15. 

25. A plant of claim 17, which plant is selected from the group consisting
of: soybean; maize; sugar cane; beet; tobacco; wheat; barley; poppy; 
rape; sunflower; alfalfa; sorghum; rose; carnation; gerbera; carrot; 
tomato; lettuce; chicory; pepper; melon; cabbage; oat; rye; cotton; 
flax; potato; pine; walnut; citrus (including oranges, grapefruit etc.); 
hemp; oak; rice; petunia; orchids; Arabidopsis; broccoli; cauliflower; 
Brussels sprouts; onion; garlic; leek; squash; pumpkin; celery; pea;
bean (including various legumes); strawberries; grapes; apples; pears; 
peaches; banana; palm; cocoa; cucumber; pineapple; apricot; plum; 
sugar beet; lawn grasses; maple; triticale; safflower; peanut; and 
olive. 
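
The identity thresholds recited in the claims above are tied to the DNAsis computer program and its default parameters, which are not reproduced here. Purely as an illustrative sketch, and not as the DNAsis algorithm, percent identity between two sequences that have already been aligned to equal length can be counted column by column. The function name `percent_identity`, the gap symbol `-`, and the choice of the full alignment length as the denominator are assumptions of this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the inputs are already aligned (equal length, '-' marks a gap).
    This is a plain column count, not the scoring used by DNAsis.
    """
    a = aligned_a.replace(" ", "").upper()
    b = aligned_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if the residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# SEQ ID NO 1 and SEQ ID NO 2 from the sequence listing, compared position by position.
SEQ_ID_1 = "tggcgccgtt gccaattg"
SEQ_ID_2 = "tggcgccgtt gtcgggga"
print(f"{percent_identity(SEQ_ID_1, SEQ_ID_2):.1f}%")
```

A real comparison of unaligned sequences would first need an alignment step, and the resulting percentage depends on the alignment parameters, which is why the claims name a specific program and its defaults.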

ABSTRACT OF THE DISCLOSURE



The present invention provides plant retroelements useful as molecular tools. 
In one embodiment, the present invention provides nucleic acids encoding gag, pol 
and/or env genes of plant retroelements. The elements can be used, among other 
uses, as building blocks of other constructs, tools to find other nucleic acid sequences 
and tools to transfer nucleic acid into cells. 

SEQUENCE LISTING



<110> Wright, David A. 

Voytas, Daniel F. 

<120> Plant Retroelements and Methods Related Thereto 

<130> P-1065A 

<140> 
<141> 

<150> 60/087125 
<151> 1998-05-29 

<150> 09/322478 
<151> 1999-05-28 

<160> 165 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 18 

<212> DNA 

<213> Glycine max 

<400> 1 

tggcgccgtt gccaattg 

<210> 2 

<211> 18 

<212> DNA 

<213> Glycine max 

<400> 2 

tggcgccgtt gtcgggga 

<210> 3 

<211> 6 

<212> DNA 

<213> Glycine max 

<400> 3 
ttgggg 



<210> 4 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 4 

Met Ala Ser Arg Lys Arg Lys 
1 5 



<210> 5 
<211> 1263 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 5 

atggcctccc gtaaacgcaa agctgtgccc acacccgggg aagcgtccaa ctgggactct 60 
tcacgtttca ctttcgagat tgcttggcac agataccagg atagcattca gctccggaac 120 
atccttccag agaggaatgt agagcttgga ccagggatgt ttgatgagtt cctgcaggaa 180 
ctccagaggc tcagatggga ccaggttctg acccgacttc cagagaagtg gattgatgtt 240 
gctctggtga aggagtttta ctccaaccta tatgatccag aggaccacag tccgaagttt 300 
tggagtgttc gaggacaggt tgtgagattt gatgctgaga cgattaatga tttcctcgac 360
accccggtca tcttggcaga gggagaggat tatccagcct actctcagta cctcagcact 420
cctccagacc atgatgccat cctttccgct ctgtgtactc cagggggacg atttgttctg 480 
aatgttgata gtgccccctg gaagctgctg cggaaggatc tgatgacgct cgcgcagaca 540 
tggagtgtgc tctcttattt taaccttgca ctgacttttc acacttctga tattaatgtt 600 
gacagggccc gactcaatta tggcttggtg atgaagatgg acctggacgt gggcagcctc 660 
atttctcttc agatcagtca gatcgcccag tccatcactt ccaggcttgg gttcccagcg 720
ttgatcacaa cactgtgtga gattcagggg gttgtctctg ataccctgat ttttgagtca 780 
ctcagtcctg tgatcaacct tgcctacatt aagaagaact gctggaaccc tgccgatcca 840
tctatcacat ttcaggggac ccgccgcacg cgcaccagag cttcggcgtc ggcatctgag 900
gctcctcttc catcccagca tccttctcag cctttttccc agagaccacg gcctccactt 960 
ctatccacct cagcacctcc atacatgcat ggacagatgc tcaggtcctt gtaccagggt 1020 
cagcagatca tcattcagaa cctgtatcga ttgtccctac atttgcagat ggatctgcca 1080 
ctcatgactc cggaggccta tcgtcagcag gtcgccaagc taggagacca gccctccact 1140
gacagggggg aagagccttc tggagccgct gctactgagg atcctgccgt tgatgaagac 1200
ctcatagctg acttggctgg cgctgattgg agcccatggg cagacttggg cagaggcagc 1260 
tga 1263 

<210> 6
<211> 421 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 6 

Met Ala Ser Arg Lys Arg Lys Ala Val Pro Thr Pro Gly Glu Ala Ser
1 5 10 15

Asn Trp Asp Ser Ser Arg Phe Thr Phe Glu Ile Ala Trp His Arg Tyr
20 25 30

Gln Asp Ser Ile Gln Leu Arg Asn Ile Leu Pro Glu Arg Asn Val Glu
35 40 45

Leu Gly Pro Gly Met Phe Asp Glu Phe Leu Gln Glu Leu Gln Arg Leu
50 55 60

Arg Trp Asp Gln Val Leu Thr Arg Leu Pro Glu Lys Trp Ile Asp Val
65 70 75 80

Ala Leu Val Lys Glu Phe Tyr Ser Asn Leu Tyr Asp Pro Glu Asp His
85 90 95

Ser Pro Lys Phe Trp Ser Val Arg Gly Gln Val Val Arg Phe Asp Ala
100 105 110

Glu Thr Ile Asn Asp Phe Leu Asp Thr Pro Val Ile Leu Ala Glu Gly
115 120 125

Glu Asp Tyr Pro Ala Tyr Ser Gln Tyr Leu Ser Thr Pro Pro Asp His
130 135 140

Asp Ala Ile Leu Ser Ala Leu Cys Thr Pro Gly Gly Arg Phe Val Leu
145 150 155 160

Asn Val Asp Ser Ala Pro Trp Lys Leu Leu Arg Lys Asp Leu Met Thr
165 170 175

Leu Ala Gln Thr Trp Ser Val Leu Ser Tyr Phe Asn Leu Ala Leu Thr
180 185 190

Phe His Thr Ser Asp Ile Asn Val Asp Arg Ala Arg Leu Asn Tyr Gly
195 200 205

Leu Val Met Lys Met Asp Leu Asp Val Gly Ser Leu Ile Ser Leu Gln
210 215 220

Ile Ser Gln Ile Ala Gln Ser Ile Thr Ser Arg Leu Gly Phe Pro Ala
225 230 235 240

Leu Ile Thr Thr Leu Cys Glu Ile Gln Gly Val Val Ser Asp Thr Leu
245 250 255

Ile Phe Glu Ser Leu Ser Pro Val Ile Asn Leu Ala Tyr Ile Lys Lys
260 265 270

Asn Cys Trp Asn Pro Ala Asp Pro Ser Ile Thr Phe Gln Gly Thr Arg
275 280 285

Arg Thr Arg Thr Arg Ala Ser Ala Ser Ala Ser Glu Ala Pro Leu Pro
290 295 300

Ser Gln His Pro Ser Gln Pro Phe Ser Gln Arg Pro Arg Pro Pro Leu
305 310 315 320

Leu Ser Thr Ser Ala Pro Pro Tyr Met His Gly Gln Met Leu Arg Ser
325 330 335

Leu Tyr Gln Gly Gln Gln Ile Ile Ile Gln Asn Leu Tyr Arg Leu Ser
340 345 350

Leu His Leu Gln Met Asp Leu Pro Leu Met Thr Pro Glu Ala Tyr Arg
355 360 365

Gln Gln Val Ala Lys Leu Gly Asp Gln Pro Ser Thr Asp Arg Gly Glu
370 375 380

Glu Pro Ser Gly Ala Ala Ala Thr Glu Asp Pro Ala Val Asp Glu Asp
385 390 395 400

Leu Ile Ala Asp Leu Ala Gly Ala Asp Trp Ser Pro Trp Ala Asp Leu
405 410 415

Gly Arg Gly Ser Glx
420



<210> 7 
<211> 1596 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 7 

atgcgaggta gaactgcatc tggagacgtt gttcctatta acttagaaat tgaagctacg 60 
tgtcggcgta acaacgctgc aagaagaaga agggagcaag acatagaagg aagtagttac 120 
acctcacctc ctccttctcc aaattatgct cagatggacg gggaaccggc acaaagagtc 180 

acactagagg acttctctaa taccaccact cctcagttct ttacaagtat cacaaggccg 240

gaagtccaag cagatctcct tactcaaggg aacctcttcc atggtcttcc aaatgaagat 300 

ccatatgcgc atctagcctc atacatagag atatgcagca ccgttaaaat cgccggagtt 360 

ccaaaagatg cgatactcct taacctcttt tccttttccc tagcaggaga ggcaaaaaga 420 

tggttgcact cctttaaagg caatagctta agaacatggg aagaagtagt ggaaaaattc 480 

ttaaagaagt atttcccaga gtcaaagacc gtcgaacgaa agatggagat ttcttatttc 540

catcaatttc tggatgaatc ccttagcgaa gcactagacc atttccacgg attgctaaga 600 

aaaacaccaa cacacagata cagcgagcca gtacaactaa acatattcat cgatgacttg 660

caactcttaa tcgaaacagc tactagaggg aagatcaagc tgaagactcc cgaagaagcg 720

atggagctcg tcgagaacat ggcggctagc gatcaagcaa tccttcatga tcacacttat 780 

gttcccacaa aaagaagcct cttggagctt agcacgcagg acgcaacttt ggtacaaaac 840

aagctgttga cgaggcagat agaagccctc atcgaaaccc tcagcaagct gcctcaacaa 900 

ttacaagcga taagttcttc ccactcttct gttttgcagg tagaagaatg ccccacatgc 960 

agagggacac atgagcctgg acaatgtgca agccaacaag acccctctcg tgaagtaaat 1020

tatataggca tactaaatcg ttacggattt cagggctaca accagggaaa tccatctgga 1080 

ttcaatcaag gggcaacaag atttaatcac gagccaccgg ggtttaatca aggaagaaac 1140

ttcatgcaag gctcaagttg gacgaataaa ggaaatcaat ataaggagca aaggaaccaa 1200

ccaccatacc agccaccata ccagcaccct agccaaggtc cgaatcagca agaaaagccc 1260 

accaaaatag aggaactgct gctgcaattc atcaaggaga caagatcaca tcaaaagagc 1320 

acggatgcag ccattcggaa tctagaagtt caaatgggcc aactggcgca tgacaaagcc 1380 

gaacggccca ctagaacttt cggtgctaac atggagagaa gaaccccaag gaaggataaa 1440

gcagtactga ctagagggca gagaagagcg caggaggagg gtaaggttga aggagaagac 1500 

tggccagaag aaggaaggac agagaagaca gaagaagaag agaaggtggc agaagaacct 1560 
aagcgtacca agagccagag agcaagggaa gccaag 1596 

<210> 8 
<211> 532 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 8 

Met Arg Gly Arg Thr Ala Ser Gly Asp Val Val Pro Ile Asn Leu Glu
1 5 10 15

Ile Glu Ala Thr Cys Arg Arg Asn Asn Ala Ala Arg Arg Arg Arg Glu
20 25 30

Gln Asp Ile Glu Gly Ser Ser Tyr Thr Ser Pro Pro Pro Ser Pro Asn
35 40 45

Tyr Ala Gln Met Asp Gly Glu Pro Ala Gln Arg Val Thr Leu Glu Asp
50 55 60

Phe Ser Asn Thr Thr Thr Pro Gln Phe Phe Thr Ser Ile Thr Arg Pro
65 70 75 80

Glu Val Gln Ala Asp Leu Leu Thr Gln Gly Asn Leu Phe His Gly Leu
85 90 95

Pro Asn Glu Asp Pro Tyr Ala His Leu Ala Ser Tyr Ile Glu Ile Cys
100 105 110

Ser Thr Val Lys Ile Ala Gly Val Pro Lys Asp Ala Ile Leu Leu Asn
115 120 125

Leu Phe Ser Phe Ser Leu Ala Gly Glu Ala Lys Arg Trp Leu His Ser
130 135 140

Phe Lys Gly Asn Ser Leu Arg Thr Trp Glu Glu Val Val Glu Lys Phe
145 150 155 160

Leu Lys Lys Tyr Phe Pro Glu Ser Lys Thr Val Glu Arg Lys Met Glu
165 170 175

Ile Ser Tyr Phe His Gln Phe Leu Asp Glu Ser Leu Ser Glu Ala Leu
180 185 190

Asp His Phe His Gly Leu Leu Arg Lys Thr Pro Thr His Arg Tyr Ser
195 200 205

Glu Pro Val Gln Leu Asn Ile Phe Ile Asp Asp Leu Gln Leu Leu Ile
210 215 220

Glu Thr Ala Thr Arg Gly Lys Ile Lys Leu Lys Thr Pro Glu Glu Ala
225 230 235 240

Met Glu Leu Val Glu Asn Met Ala Ala Ser Asp Gln Ala Ile Leu His
245 250 255

Asp His Thr Tyr Val Pro Thr Lys Arg Ser Leu Leu Glu Leu Ser Thr
260 265 270

Gln Asp Ala Thr Leu Val Gln Asn Lys Leu Leu Thr Arg Gln Ile Glu
275 280 285

Ala Leu Ile Glu Thr Leu Ser Lys Leu Pro Gln Gln Leu Gln Ala Ile
290 295 300

Ser Ser Ser His Ser Ser Val Leu Gln Val Glu Glu Cys Pro Thr Cys
305 310 315 320

Arg Gly Thr His Glu Pro Gly Gln Cys Ala Ser Gln Gln Asp Pro Ser
325 330 335

Arg Glu Val Asn Tyr Ile Gly Ile Leu Asn Arg Tyr Gly Phe Gln Gly
340 345 350

Tyr Asn Gln Gly Asn Pro Ser Gly Phe Asn Gln Gly Ala Thr Arg Phe
355 360 365

Asn His Glu Pro Pro Gly Phe Asn Gln Gly Arg Asn Phe Met Gln Gly
370 375 380

Ser Ser Trp Thr Asn Lys Gly Asn Gln Tyr Lys Glu Gln Arg Asn Gln
385 390 395 400

Pro Pro Tyr Gln Pro Pro Tyr Gln His Pro Ser Gln Gly Pro Asn Gln
405 410 415

Gln Glu Lys Pro Thr Lys Ile Glu Glu Leu Leu Leu Gln Phe Ile Lys
420 425 430

Glu Thr Arg Ser His Gln Lys Ser Thr Asp Ala Ala Ile Arg Asn Leu
435 440 445

Glu Val Gln Met Gly Gln Leu Ala His Asp Lys Ala Glu Arg Pro Thr
450 455 460

Arg Thr Phe Gly Ala Asn Met Glu Arg Arg Thr Pro Arg Lys Asp Lys
465 470 475 480

Ala Val Leu Thr Arg Gly Gln Arg Arg Ala Gln Glu Glu Gly Lys Val
485 490 495

Glu Gly Glu Asp Trp Pro Glu Glu Gly Arg Thr Glu Lys Thr Glu Glu
500 505 510

Glu Glu Lys Val Ala Glu Glu Pro Lys Arg Thr Lys Ser Gln Arg Ala
515 520 525

Arg Glu Ala Lys
530



<210> 9 
<211> 603 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 9 

tgtgataaat gccagagaac aggggggata tctcgaagaa atgagatgcc tttgcagaat 60 
atcatggaag tagagatctt tgactgttgg ggcatagact tcatggggcc ttttccttcg 120
tcatacggga atgtctacat cttggtagct gtggattacg tctccaaatg ggtggaagcc 180 
atagccacgc caaaggacga tgccagggta gtgatcaaat ttctgaagaa gaacattttt 240
tcccgttttg gagtcccacg agccttgatt agtgataggg gaacgcactt ctgcaacaat 300 
cagttgaaga aagtcctgga gcactataat gtccgacata aggtggccac accttatcac 360 
cctcagacaa atggccaagc agaaatttct aacagggagc tcaagcgaat cctggaaaag 420
acagttgcat caacaagaaa ggattggtcc ttgaagctcg atgatgctct ctgggcctat 480
aggacagcgt tcaagactcc catcggctta tcaccatttc agctagtgta tgggaaggca 540
tgtcatttac cagtggagct ggagtacaaa gcatattggg ctctcaagtt gctcaacttt 600 
gac 603 



<210> 10 
<211> 201 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 10 

Cys Asp Lys Cys Gln Arg Thr Gly Gly Ile Ser Arg Arg Asn Glu Met
1 5 10 15

Pro Leu Gln Asn Ile Met Glu Val Glu Ile Phe Asp Cys Trp Gly Ile
20 25 30

Asp Phe Met Gly Pro Phe Pro Ser Ser Tyr Gly Asn Val Tyr Ile Leu
35 40 45

Val Ala Val Asp Tyr Val Ser Lys Trp Val Glu Ala Ile Ala Thr Pro
50 55 60

Lys Asp Asp Ala Arg Val Val Ile Lys Phe Leu Lys Lys Asn Ile Phe
65 70 75 80

Ser Arg Phe Gly Val Pro Arg Ala Leu Ile Ser Asp Arg Gly Thr His
85 90 95

Phe Cys Asn Asn Gln Leu Lys Lys Val Leu Glu His Tyr Asn Val Arg
100 105 110

His Lys Val Ala Thr Pro Tyr His Pro Gln Thr Asn Gly Gln Ala Glu
115 120 125

Ile Ser Asn Arg Glu Leu Lys Arg Ile Leu Glu Lys Thr Val Ala Ser
130 135 140

Thr Arg Lys Asp Trp Ser Leu Lys Leu Asp Asp Ala Leu Trp Ala Tyr
145 150 155 160

Arg Thr Ala Phe Lys Thr Pro Ile Gly Leu Ser Pro Phe Gln Leu Val
165 170 175

Tyr Gly Lys Ala Cys His Leu Pro Val Glu Leu Glu Tyr Lys Ala Tyr
180 185 190

Trp Ala Leu Lys Leu Leu Asn Phe Asp
195 200



<210> 11 
<211> 600 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 11 

ttggaggctg ggctcatata ccccatctct gacagcgctt gggtaagccc agtacaggtg 60 
gttcccaaga aaggtggaat gacagtggta cgagatgaga ggaatgactt gataccaaca 120 
cgaactgtca ctggttggcg aatgtgtatc gactatcgca agctgaatga agccacacgg 180
aaggaccatt tccccttacc tttcatggat cagatgctgg agagacttgc agggcaggca 240
tactactgtt tcttggatgg atactcggga tacaaccaga tcgcggtaga ccccagagat 300
caggagaaga cggcctttac atgccccttt ggcgtctttg cttacagaag gatgccattc 360 
gggttatgta atgcaccagc cacatttcag aggtgcatgc tggccatttt ttcagacatg 420
gtggagaaaa gcatcgaggt atttatggac gacttctcgg tttttggacc ctcatttgac 480 
agctgtttga ggaacctaga gagggtactt cagaggtgcg aagagactaa cttggtactg 540
aattgggaaa agtgtcattt catggttcga gagggcatag tcctaggcca caagatctca 600 

<210> 12 
<211> 200 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 12 

Leu Glu Ala Gly Leu Ile Tyr Pro Ile Ser Asp Ser Ala Trp Val Ser
1 5 10 15

Pro Val Gln Val Val Pro Lys Lys Gly Gly Met Thr Val Val Arg Asp
20 25 30

Glu Arg Asn Asp Leu Ile Pro Thr Arg Thr Val Thr Gly Trp Arg Met
35 40 45

Cys Ile Asp Tyr Arg Lys Leu Asn Glu Ala Thr Arg Lys Asp His Phe
50 55 60

Pro Leu Pro Phe Met Asp Gln Met Leu Glu Arg Leu Ala Gly Gln Ala
65 70 75 80

Tyr Tyr Cys Phe Leu Asp Gly Tyr Ser Gly Tyr Asn Gln Ile Ala Val
85 90 95

Asp Pro Arg Asp Gln Glu Lys Thr Ala Phe Thr Cys Pro Phe Gly Val
100 105 110

Phe Ala Tyr Arg Arg Met Pro Phe Gly Leu Cys Asn Ala Pro Ala Thr
115 120 125

Phe Gln Arg Cys Met Leu Ala Ile Phe Ser Asp Met Val Glu Lys Ser
130 135 140

Ile Glu Val Phe Met Asp Asp Phe Ser Val Phe Gly Pro Ser Phe Asp
145 150 155 160

Ser Cys Leu Arg Asn Leu Glu Arg Val Leu Gln Arg Cys Glu Glu Thr
165 170 175

Asn Leu Val Leu Asn Trp Glu Lys Cys His Phe Met Val Arg Glu Gly
180 185 190

Ile Val Leu Gly His Lys Ile Ser
195 200



<210> 13 
<211> 858 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 13 

aaggaagaac cactagccct tccacaggat ctcccatatc ctatggcacc caccaagaag 60 
aacaaggagc gttactttgc acgtttcttg gaaatattca aagggttaga aatcactatg 120
ccattcgggg aagccttaca gcagatgccc ctctactcca aatttatgaa agacatcctc 180 
accaagaagg ggaagtatat tgacaacgag aatattgtgg taggaggcaa ttgcagtgcg 240
ataatacaaa ggattctacc caagaagttt aaagaccccg gaagtgttac catcccgtgc 300
accattggga aggaagccgt aaacaaggcc ctcattgatc taggagcaag tatcaatctg 360 
atgcccttgt caatgtgcaa aagaattggg aatttgaaga tagatcccac caagatgacg 420
cttcaactgg cagaccgctc aatcacaagg ccatatgggg tggtagaaga tgtcctggtc 480 
aaggtacgcc acttcacttt tccggtggac tttgttatca tggatatcga agaagacact 540
gagattcccc ttatcttagg cagacccttc atgctgactg ccaactgtgt ggtggatatg 600 
gggaaaggga acttagagtt gactattgat aatcagaaga tcacctttga ccttatcaag 660 
gcaatgaagt acccacagga gggttggaag tgcttcagaa tagaggagat tgatgaggaa 720
gatgtcagtt ttctcgagac accaaagact tcgctagaaa aagcaatggt aaatcattta 780
gactgtctaa ccagtgaaga ggaagaagat ctgaaggctt gcttggaaaa cttggatcaa 840 
gaagacagta ttcctgag 858 



<210> 14 
<211> 286 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 14 

Lys Glu Glu Pro Leu Ala Leu Pro Gln Asp Leu Pro Tyr Pro Met Ala
1 5 10 15

Pro Thr Lys Lys Asn Lys Glu Arg Tyr Phe Ala Arg Phe Leu Glu Ile
20 25 30

Phe Lys Gly Leu Glu Ile Thr Met Pro Phe Gly Glu Ala Leu Gln Gln
35 40 45

Met Pro Leu Tyr Ser Lys Phe Met Lys Asp Ile Leu Thr Lys Lys Gly
50 55 60

Lys Tyr Ile Asp Asn Glu Asn Ile Val Val Gly Gly Asn Cys Ser Ala
65 70 75 80

Ile Ile Gln Arg Ile Leu Pro Lys Lys Phe Lys Asp Pro Gly Ser Val
85 90 95

Thr Ile Pro Cys Thr Ile Gly Lys Glu Ala Val Asn Lys Ala Leu Ile
100 105 110

Asp Leu Gly Ala Ser Ile Asn Leu Met Pro Leu Ser Met Cys Lys Arg
115 120 125

Ile Gly Asn Leu Lys Ile Asp Pro Thr Lys Met Thr Leu Gln Leu Ala
130 135 140

Asp Arg Ser Ile Thr Arg Pro Tyr Gly Val Val Glu Asp Val Leu Val
145 150 155 160

Lys Val Arg His Phe Thr Phe Pro Val Asp Phe Val Ile Met Asp Ile
165 170 175

Glu Glu Asp Thr Glu Ile Pro Leu Ile Leu Gly Arg Pro Phe Met Leu
180 185 190

Thr Ala Asn Cys Val Val Asp Met Gly Lys Gly Asn Leu Glu Leu Thr
195 200 205

Ile Asp Asn Gln Lys Ile Thr Phe Asp Leu Ile Lys Ala Met Lys Tyr
210 215 220

Pro Gln Glu Gly Trp Lys Cys Phe Arg Ile Glu Glu Ile Asp Glu Glu
225 230 235 240

Asp Val Ser Phe Leu Glu Thr Pro Lys Thr Ser Leu Glu Lys Ala Met
245 250 255

Val Asn His Leu Asp Cys Leu Thr Ser Glu Glu Glu Glu Asp Leu Lys
260 265 270

Ala Cys Leu Glu Asn Leu Asp Gln Glu Asp Ser Ile Pro Glu
275 280 285

<210> 15
<211> 192 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence

<400> 15 

tttgaactaa tgtgtgatgc cagtgattat gcagtaggag cagttttggg acagaggaaa 60 
gacaaggtat ttcacgccat ctattatgct agcaaggtcc tgaatgaagc acagttgaat 120 
tatgcaacca cagaaaagga gatgctagcc attgtctttg ccttggagaa gttcaggtca 180 
tacttgatag gg 192 



<210> 16 
<211> 64 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 16 

Phe Glu Leu Met Cys Asp Ala Ser Asp Tyr Ala Val Gly Ala Val Leu
1 5 10 15

Gly Gln Arg Lys Asp Lys Val Phe His Ala Ile Tyr Tyr Ala Ser Lys
20 25 30

Val Leu Asn Glu Ala Gln Leu Asn Tyr Ala Thr Thr Glu Lys Glu Met
35 40 45

Leu Ala Ile Val Phe Ala Leu Glu Lys Phe Arg Ser Tyr Leu Ile Gly
50 55 60
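
The listing pairs each coding DNA record with the protein record that follows it (for example SEQ ID NO 15 with SEQ ID NO 16). As a hedged illustration of how such a pair can be cross-checked, the sketch below translates SEQ ID NO 15 in reading frame 1 with the standard genetic code. The helper names `translate` and `three_letter` are hypothetical and introduced only for this example; the listing itself does not state which software produced the translations.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; spaces and position counters are ignored."""
    seq = "".join(ch for ch in dna.upper() if ch in "ACGT")
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def three_letter(protein: str) -> str:
    """Render a one-letter protein string in the three-letter style of the listing."""
    return " ".join(ONE_TO_THREE[aa] for aa in protein)


# SEQ ID NO 15 as given in the listing (192 nucleotides).
SEQ_ID_15 = (
    "tttgaactaa tgtgtgatgc cagtgattat gcagtaggag cagttttggg acagaggaaa "
    "gacaaggtat ttcacgccat ctattatgct agcaaggtcc tgaatgaagc acagttgaat "
    "tatgcaacca cagaaaagga gatgctagcc attgtctttg ccttggagaa gttcaggtca "
    "tacttgatag gg"
)

protein = translate(SEQ_ID_15)
print(len(protein))
print(three_letter(protein))
```

The printed three-letter string can be compared residue by residue with SEQ ID NO 16, which is a convenient way to catch transcription errors such as `Gin` for `Gln` or `He` for `Ile`.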



<210> 17 
<211> 12286 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: plant 
retroelement sequence 



<400> 17 

tgataactgc taaataattg tgaattaata gtagaaaatt agtcaaattt tggcttaaaa 60
ttaattattt agcagttatt tgtgattaaa agttagaaaa gcaattaagt tgaatttttg 120
gccatagata tgaaaactga aggtacaaca agcaaaaggc agcagaaagt gaagaaaaag 180
aataaaatct gaagcagacc cagcccaaca cgcgccctta gcgcgcgtca cgcgctaagc 240
ttgcaaggca gcacaggcac taagcgaggc gttaagcacg aagatgcagg attcgttacg 300
tgcgctaagc gcgaggcaca cgctaagcgc gcgatccaac agaagcacac gctaagcctg 360
cagcatgcgc taagcgcgcc tacgaaggcc caaagcccat ttctacacct ataaatagag 420
atccaagcca agggagaatg tacaccttgc ctcagagcac ttctctcagc attccaagct 480
tgagctctcc cttttctctc tatattcttt gcttttatta tccattcttt ctttcacccc 540
agttgtaaag cccctcaatg gccatgagtg gttaatcccc tagctacggc ctggtaggcc 600
taaaaagcca atgatgtatg gtgtacttca agagttatca atgcaaagag gattcattcc 660
aggttttatg ttctaattct ttccttttta tcttgcattt atgtcttaaa tttctgttgg 720
gttttattcg ctcgggagag ggtatttcct aataagggtt taagaagtaa tgcatgcatc 780
agttttaggg gttatacgct tggtaaaggg taacacctaa tagaacaaat taagaaaagg 840
atcgtcgggc tagcattgct aggcatagaa tgatggccca atgcccatgc atttagcaac 900
atctagaatt taaccttaat gcattttaat tattgaatct tcacaaaggc atttgggaga 960
taggtagtta aaataggctt gtcatcgtga ggcatcaagg gcaagtaaaa ttaatagatg 1020
tgggtagaac taattcaact gcattggtaa tgaacatcat aaattcattc atcgtaggcc 1080
aattaggttt gtccggtctt ggcattttca tcaattgtct tcctaaatta tttgatctaa 1140
tagcaacaat ttattcttat gcctattcct gtttttacta tttactttta cttacaaatt 1200
gaagagtatt caataaagtg caataaaatc cctatggaaa cgatactcgg acttccgaga 1260
attactactt agaacgattt ggtacacttg tcaaacacct caacaagttt ttggcgccgt 1320
tgtcggggat tttgttctcg cacttaattg ccatactata ttagtttgta agcttaattc 1380
ttcttttctt ggctcattct tttattattc tttactttac tttttcttct atcctttctt 1440
tcttctccca taaattgcac gggtagtgcc tttttgtttt tatgcgaggt agaactgcat 1500
ctggagacgt tgttcctatt aacttagaaa ttgaagctac gtgtcggcgt aacaacgctg 1560
caagaagaag aagggagcaa gacatagaag gaagtagtta cacctcacct cctccttctc 1620
caaattatgc tcagatggac ggggaaccgg cacaaagagt cacactagag gacttctcta 1680
ataccaccac tcctcagttc tttacaagta tcacaaggcc ggaagtccaa gcagatctcc 1740
ttactcaagg gaacctcttc catggtcttc caaatgaaga tccatatgcg catctagcct 1800
catacataga gatatgcagc accgttaaaa tcgccggagt tccaaaagat gcgatactcc 1860
ttaacctctt ttccttttcc ctagcaggag aggcaaaaag atggttgcac tcctttaaag 1920
gcaatagctt aagaacatgg gaagaagtag tggaaaaatt cttaaagaag tatttcccag 1980
agtcaaagac cgtcgaacga aagatggaga tttcttattt ccatcaattt ctggatgaat 2040
cccttagcga agcactagac catttccacg gattgctaag aaaaacacca acacacagat 2100
acagcgagcc agtacaacta aacatattca tcgatgactt gcaactctta atcgaaacag 2160
ctactagagg gaagatcaag ctgaagactc ccgaagaagc gatggagctc gtcgagaaca 2220
tggcggctag cgatcaagca atccttcatg atcacactta tgttcccaca aaaagaagcc 2280
tcttggagct tagcacgcag gacgcaactt tggtacaaaa caagctgttg acgaggcaga 2340
tagaagccct catcgaaacc ctcagcaagc tgcctcaaca attacaagcg ataagttctt 2400
cccactcttc tgttttgcag gtagaagaat gccccacatg cagagggaca catgagcctg 2460
gacaatgtgc aagccaacaa gacccctctc gtgaagtaaa ttatataggc atactaaatc 2520
gttacggatt tcagggctac aaccagggaa atccatctgg attcaatcaa ggggcaacaa 2580
gatttaatca cgagccaccg gggtttaatc aaggaagaaa cttcatgcaa ggctcaagtt 2640 
ggacgaataa aggaaatcaa tataaggagc aaaggaacca accaccatac cagccaccat 2700 
accagcaccc tagccaaggt ccgaatcagc aagaaaagcc caccaaaata gaggaactgc 2760 
tgctgcaatt catcaaggag acaagatcac atcaaaagag cacggatgca gccattcgga 2820 
atctagaagt tcaaatgggc caactggcgc atgacaaagc cgaacggccc actagaactt 2880
tcggtgctaa catggagaga agaaccccaa ggaaggataa agcagtactg actagagggc 2940 
agagaagagc gcaggaggag ggtaaggttg aaggagaaga ctggccagaa gaaggaagga 3000
cagagaagac agaagaagaa gagaaggtgg cagaagaacc taagcgtacc aagagccaga 3060
gagcaaggga agccaagaag gaagaaccac tagcccttcc acaggatctc ccatatccta 3120
tggcacccac caagaagaac aaggagcgtt actttgcacg tttcttggaa atattcaaag 3180 
ggttagaaat cactatgcca ttcggggaag ccttacagca gatgcccctc tactccaaat 3240 
ttatgaaaga catcctcacc aagaagggga agtatattga caacgagaat attgtggtag 3300
gaggcaattg cagtgcgata atacaaagga ttctacccaa gaagtttaaa gaccccggaa 3360
gtgttaccat cccgtgcacc attgggaagg aagccgtaaa caaggccctc attgatctag 3420 
gagcaagtat caatctgatg cccttgtcaa tgtgcaaaag aattgggaat ttgaagatag 3480
atcccaccaa gatgacgctt caactggcag accgctcaat cacaaggcca tatggggtgg 3540 
tagaagatgt cctggtcaag gtacgccact tcacttttcc ggtggacttt gttatcatgg 3600
atatcgaaga agacactgag attcccctta tcttaggcag acccttcatg ctgactgcca 3660 
actgtgtggt ggatatgggg aaagggaact tagagttgac tattgataat cagaagatca 3720
cctttgacct tatcaaggca atgaagtacc cacaggaggg ttggaagtgc ttcagaatag 3780
aggagattga tgaggaagat gtcagttttc tcgagacacc aaagacttcg ctagaaaaag 3840
caatggtaaa tcatttagac tgtctaacca gtgaagagga agaagatctg aaggcttgct 3900
tggaaaactt ggatcaagaa gacagtattc ctgagggaga agccaatttc gaggagctag 3960
agaaggaagt tccgtctgag aagccgaaga tagagttgaa gatattgcct gatcatctga 4020
agtatgtgtt cttggaggaa gataaaccta tagtgatcag taacgcactc acaacagagg 4080 
aggaaaatag gttggtagat gtcctcaaga aacacaggga agcaattgga tggcacatat 4140 
cggatctcaa ggaaattagc cctgcttact gcatgcacag gataatgatg gaagaggact 4200
acaagccagt ccgacaaccc cagaggcggc tgaatccaac aatgaaggaa gaggtaagaa 4260 
aggaggtact caagctcttg gaggctgggc tcatataccc catctctgac agcgcttggg 4320
taagcccagt acaggtggtt cccaagaaag gtggaatgac agtggtacga gatgagagga 4380
atgacttgat accaacacga actgtcactg gttggcgaat gtgtatcgac tatcgcaagc 4440
tgaatgaagc cacacggaag gaccatttcc ccttaccttt catggatcag atgctggaga 4500
gacttgcagg gcaggcatac tactgtttct tggatggata ctcgggatac aaccagatcg 4560 
cggtagaccc cagagatcag gagaagacgg cctttacatg cccctttggc gtctttgctt 4620 
acagaaggat gccattcggg ttatgtaatg caccagccac atttcagagg tgcatgctgg 4680 
ccattttttc agacatggtg gagaaaagca tcgaggtatt tatggacgac ttctcggttt 4740 
ttggaccctc atttgacagc tgtttgagga acctagagag ggtacttcag aggtgcgaag 4800 
agactaactt ggtactgaat tgggaaaagt gtcatttcat ggttcgagag ggcatagtcc 4860
taggccacaa gatctcagcc agagggattg aggttgatcg ggcaaagata gacgtcatcg 4920
agaagctgcc accaccactg aatgttaaag gggttagaag tttcttaggg catgcaggtt 4980
tctacaggag gtttatcaag gacttctcga agattgccag gcccttaagc aatctgttga 5040
ataaagacgt ggcttttgtg tttgatgaag aatgtttagc agcatttcaa tcactgaaga 5100 
ataagctcgt cactgcaccc gtaatgattg cacccgactg gaataaagat tttgaactaa 5160 
tgtgtgatgc cagtgattat gcagtaggag cagttttggg acagaggaaa gacaaggtat 5220 
ttcacgccat ctattatgct agcaaggtcc tgaatgaagc acagttgaat tatgcaacca 5280 
cagaaaagga gatgctagcc attgtctttg ccttggagaa gttcaggtca tacttgatag 5340
ggtcgagggt catcatttac acagatcatg ctgccatcaa gcacctgctc gccaaaacag 5400
actcaaagcc gaggttgatt agatgggtcc tgctgttaca agaatttgac atcatcatca 5460
aggacaagaa aggatccgag aatgtggtag ccaatcatct atctcgatta aagaatgaag 5520
aagtcaccaa ggaagaacca gaggtaaaag gtgaatttcc tgatgagttt cttttgcagg 5580 
ttaccgaaag accttggttt gcagacatgg ctaactacaa agccacggga gtcattccag 5640
aggagtttaa ttggagtcag aggaagaaat tcttgcacga tgcacgcttc tatgtgtggg 5700 
atgatcctca tttgttcaag gcaggagcag ataatttatt aaggagatgc gtcacaaagg 5760 
aggaagcacg gagcattctt tggcactgcc acagttcacc ctatggcgga caccacagtg 5820 
gggacagaac agcagcaaaa gtgctacaat caggtttttt ctggccctct atttttaaag 5880 
atgctcacga gtttgtgcgt tgttgtgata aatgccagag aacagggggg atatctcgaa 5940 
gaaatgagat gcctttgcag aatatcatgg aagtagagat ctttgactgt tggggcatag 6000 
acttcatggg gccttttcct tcgtcatacg ggaatgtcta catcttggta gctgtggatt 6060 
acgtctccaa atgggtggaa gccatagcca cgccaaagga cgatgccagg gtagtgatca 6120 
aatttctgaa gaagaacatt ttttcccgtt ttggagtccc acgagccttg attagtgata 6180
ggggaacgca cttctgcaac aatcagttga agaaagtcct ggagcactat aatgtccgac 6240
ataaggtggc cacaccttat caccctcaga caaatggcca agcagaaatt tctaacaggg 6300
agctcaagcg aatcctggaa aagacagttg catcaacaag aaaggattgg tccttgaagc 6360 
tcgatgatgc tctctgggcc tataggacag cgttcaagac tcccatcggc ttatcaccat 6420 
ttcagctagt gtatgggaag gcatgtcatt taccagtgga gctggagtac aaagcatatt 6480
gggctctcaa gttgctcaac tttgacaaca acgcatgcgg ggaaaagagg aagctacagc 6540
tgctggaatt agaagagatg agactgaatg cctacgagtc atccaaaatt tacaaggaaa 6600 
agatgaaggc atatcatgac aagaagctac tgaggaaaga attccagcca gggcagcagg 6660 
tattactctt taactcaagg ctaaggctat tcccaggtaa gctgaagtcc aagtggtcag 6720
ggccattcat aatcaaagaa gtcagacctt acggagcagt agaattggtg gaccctagag 6780 
aagaggactt tgagaagaaa tggatcgtca atggacagcg cttgaagcct tataacggag 6840 
gacaactaga gcgattgacg accatcatct acttaaatga cccttgagaa ggcctactgt 6900 
ctagctaaag acaataaact aagcgctggt tgggaggcaa cccaacatat tttgtaaaaa 6960 
tgtagttatc tttattctat gtaaaaaaaa aaaaaaagcc caataggtgc aaataggaaa 7020 
caggaggtgc aaaaagcaaa ggcccaacag gtgaagacaa caataggagg ggtgccaata 7080 
gcaaaactga agtgggctgc acgaagccac gcgcccaatt cttggtcttt tcacacaaaa 7140 
caatcactaa cgaaggtaaa gaattgcttt gtatggatgt tgttatgaat gcacaggtaa 7200 
cagcacgcta agccctgctc gacgcttagc caatgaagac ggattgaagg ccataacgac 7260 
gagctcgtta agcgtgacga agcacgctaa gcaggcgcct gacaggacga gaaagcaaag 7320 
cgcgcgctta gccggcactt ccgcgctaag cgcgctcatg aacatcactg aacgcgctaa 7380 
acgtgtgcca gaggcgctaa acgcgtgcca gaggcgctaa acgcgtgcat tagtcacagc 7440 
aggatggtgc taagcgcggg gttgggcctc agggcccatc aaccctcgca ccttacttgt 7500 
tgcaccccta tttctactat tcccactccc ttctaatttc tttttgcacc ccccttcttt 7560 
actgactgca cctctatttt gattactttt tgcacccccc ctgattgcta acttcagact 7620 
atctttcttg ttttttgttt ttttggtttt ttggtcagat ggcctcccgt aaacgcaaag 7680 
ctgtgcccac acccggggaa gcgtccaact gggactcttc acgtttcact ttcgagattg 7740 
cttggcacag ataccaggat agcattcagc tccggaacat ccttccagag aggaatgtag 7800 
agcttggacc agggatgttt gatgagttcc tgcaggaact ccagaggctc agatgggacc 7860 
aggttctgac ccgacttcca gagaagtgga ttgatgttgc tctggtgaag gagttttact 7920 
ccaacctata tgatccagag gaccacagtc cgaagttttg gagtgttcga ggacaggttg 7980 
tgagatttga tgctgagacg attaatgatt tcctcgacac cccggtcatc ttggcagagg 8040 
gagaggatta tccagcctac tctcagtacc tcagcactcc tccagaccat gatgccatcc 8100 
tttccgctct gtgtactcca gggggacgat ttgttctgaa tgttgatagt gccccctgga 8160 
agctgctgcg gaaggatctg atgacgctcg cgcagacatg gagtgtgctc tcttatttta 8220 
accttgcact gacttttcac acttctgata ttaatgttga cagggcccga ctcaattatg 8280 
gcttggtgat gaagatggac ctggacgtgg gcagcctcat ttctcttcag atcagtcaga 8340 
tcgcccagtc catcacttcc aggcttgggt tcccagcgtt gatcacaaca ctgtgtgaga 8400 
ttcagggggt tgtctctgat accctgattt ttgagtcact cagtcctgtg atcaaccttg 8460 
cctacattaa gaagaactgc tggaaccctg ccgatccatc tatcacattt caggggaccc 8520 
gccgcacgcg caccagagct tcggcgtcgg catctgaggc tcctcttcca tcccagcatc 8580 
cttctcagcc tttttcccag agaccacggc ctccacttct atccacctca gcacctccat 8640 
acatgcatgg acagatgctc aggtccttgt accagggtca gcagatcatc attcagaacc 8700 
tgtatcgatt gtccctacat ttgcagatgg atctgccact catgactccg gaggcctatc 8760 
gtcagcaggt cgccaagcta ggagaccagc cctccactga caggggggaa gagccttctg 8820 
gagccgctgc tactgaggat cctgccgttg atgaagacct catagctgac ttggctggcg 8880 
ctgattggag cccatgggca gacttgggca gaggcagctg atcttatgct ttaatgtttt 8940 
cttttatatt atgtttgtgt tctcttttat gttttatgtt atgtttttat gtagtctgtt 9000 
tggtaattaa aaagaggtag tagtaaaaat attagtattt cagtatgtgt tttctgagta 9060 
ataagtgcat gataactcaa gcaatcataa ttctttagct tgttcagaaa ggttcaacac 9120 
ttgagatgcc actgatcctt ggagaaacac tggttctgga agcaaaagtc aggtcaagaa 9180 
atggaacatg aatagcacag agtggaaagg ttagcttgat ggaacaaggt cataactggt 9240 
acgccgaata cttgtttaag tccctgtgag catggttgtc aaactctaga gtcaactcat 9300 
agactctcat gagtttaaga gtttacttca gtcccgcgag ttgactcgga agcaaactcg 9360 
cttttgagca aactcgtgga ctcggagtga actcatgtaa actcgtaaga gtctacgagt 9420 
tgactctaga gtttgacaac catgcataag tgttcaaaat taaagcattt aaataattaa 9480 
aaaaagcaca aatgtcttca aagaagcatg ttcaatcctc taataggatc atcttcatga 9540 
atatcatcac tttcatcatc atctccatct ccatcatcat catcaaggtc ttcctcagat 9600 
tgtgcatcat cattaggttc cacaaagatt aaattatcta gatcaaaagc ttaaaataga 9660 
tatcaaatat gctatattag aaatagttaa aacttaaaat aatacacaag caaattttaa 9720 
atatgagaaa gttcagaaat tatacctttt cttggtgtta ttaaagtttc attttatctt 9780 
ctcttttgca ttttccatct cctcacatat gaaaagcata attctattga atttcagtaa 9840 
caagtttgat ccaactccaa cattgtaagg tcagttgttg tgttttgtaa tagactaata 9900 
tgaagtatga agtatgaact atgaacttat tgtcatctgt ttgcaaattg gtgcattttg 9960 
aatatattta cttattatcc attttttttt ttttacgaag tagactctca cgagtctgcg 10020 
tagactctcg atatcgataa ccttgccgat gagagtgtga acttaattgt gagagaaaat 10080 
gcctattttt aagttcctgg ttttgcatca ttcttagacg gttagaatag ttacttaagg 10140 
tggatatgat caaggccatg tttgtttgtt tacctactta gccaaaaagc caacctaaca 10200 
tagttttacc ccttgcaccc atgattgagc caactgatta ttttgaatta accttgagcc 10260 
aattaaacaa aatcctgacc ttttaggatt ttaagagagt aaaaatgggt tataaaggtc 10320 
ttaatttggg ggattttggg aaataggtag ccaagacaat aagtacagca cacaaagtag 10380 
gacacctttt acaaacagta ggcccaattt cgaaaaaaaa atgaaaagaa tttaataaag 10440 
ggcagaaaca aaagagcaag agaggtgtca aaagaaaagt gttgtgggga aataaaaggg 10500 
ctaagtaaaa aggcctaggc agaattggaa atttttgttc tcttttaatc ctaactttga 10560 
atttccaaga aaaaccatga ttttttgtaa gccaggcccc gatacaagcc aataaagtcc 10620 
ttagtgatcc accaaaggta actagagata actgtaactg agatgaaatg caaaattttg 10680 
aagtgttact tgcaggttgt tatcaaattg caaacactaa actaggcact tgtgagcaga 10740 
gggaaacacc agccttgtga ggaaagtaag gcaagccaaa tttgattgag ttccagatga 10800 
ctaactgatt caattcttct gttgtaatgc tttcatttta agatgttgac agatgcagaa 10860 
aggaccagtg aaagaaggag gaactgagcc attgatagtg ttggaatatt taagaacttg 10920 
cttgagaatt tacttgtttt tggttttctt ggggacaagc aaagtttcat ttggggaatt 10980 
ttgataactg ctaaataatt gtgaattaat agtagaaaat tagtcaaatt ttggcttaaa 11040 
attaattatt tagcagttat ttgtgattaa aagttagaaa agcaattaag ttgaattttt 11100 
ggccatagat atgaaaactg aaggtacaac aagcaaaagg cagcagaaag tgaagaaaaa 11160 
gaataaaatc tgaagcagac ccagcccaac acgcgccctt agcgcgcgtc acgcgctaag 11220 
cttgcaaggc agcacaggca ctaagcgagg cgttaagcac gaagatgcag gattcgttac 11280 
gtgcgctaag cgcgaggcac acgctaagcg cgcgatccaa cagaagcaca cgctaagcct 11340 
gcagcatgcg ctaagcgcgc ctacgaaggc ccaaagccca tttctacacc tataaataga 11400 
gatccaagcc aagggagaat gtacaccttg cctcagagca cttctctcag cattccaagc 11460 
ttgagctctc ccttttctct ctatattctt tgcttttatt atccattctt tctttcaccc 11520 
cagttgtaaa gcccctcaat ggccatgagt ggttaatccc ctagctacgg cctggtaggc 11580 
ctaaaaagcc aatgatgtat ggtgtacttc aagagttatc aatgcaaaga ggattcattc 11640 
caggttttat gttctaattc tttccttttt atcttgcatt tatgtcttaa atttctgttg 11700 
ggttttattc gctcgggaga gggtatttcc taataagggt ttaagaagta atgcatgcat 11760 
cagttttagg ggttatacgc ttggtaaagg gtaacaccta atagaacaaa ttaagaaaag 11820 
gatcgtcggg ctagcattgc taggcataga atgatggccc aatgcccatg catttagcaa 11880 
catctagaat ttaaccttaa tgcattttaa ttattgaatc ttcacaaagg catttgggag 11940 
ataggtagtt aaaataggct tgtcatcgtg aggcatcaag ggcaagtaaa attaatagat 12000 
gtgggtagaa ctaattcaac tgcattggta atgaacatca taaattcatt catcgtaggc 12060 
caattaggtt tgtccggtct tggcattttc atcaattgtc ttcctaaatt atttgatcta 12120 
atagcaacaa tttattctta tgcctattcc tgtttttact atttactttt acttacaaat 12180 
tgaagagtat tcaataaagt gcaataaaat ccctatggaa acgatactcg gacttccgag 12240 
aattactact tagaacgatt tggtacactt gtcaaacacc tcaaca 12286 



<210> 18 
<211> 1802 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plant 
retroelement sequence 

<400> 18 

Met Arg Gly Arg Thr Ala Ser Gly Asp Val Val Pro Ile Asn Leu Glu 
1 5 10 15 

Ile Glu Ala Thr Cys Arg Arg Asn Asn Ala Ala Arg Arg Arg Arg Glu 
20 25 30 

Gln Asp Ile Glu Gly Ser Ser Tyr Thr Ser Pro Pro Pro Ser Pro Asn 
35 40 45 

Tyr Ala Gln Met Asp Gly Glu Pro Ala Gln Arg Val Thr Leu Glu Asp 
50 55 60 

Phe Ser Asn Thr Thr Thr Pro Gln Phe Phe Thr Ser Ile Thr Arg Pro 
65 70 75 80 

Glu Val Gln Ala Asp Leu Leu Thr Gln Gly Asn Leu Phe His Gly Leu 
85 90 95 

Pro Asn Glu Asp Pro Tyr Ala His Leu Ala Ser Tyr Ile Glu Ile Cys 
100 105 110 



Ser Thr Val Lys Ile Ala Gly Val Pro Lys Asp Ala Ile Leu Leu Asn 
115 120 125 

Leu Phe Ser Phe Ser Leu Ala Gly Glu Ala Lys Arg Trp Leu His Ser 
130 135 140 

Phe Lys Gly Asn Ser Leu Arg Thr Trp Glu Glu Val Val Glu Lys Phe 
145 150 155 160 

Leu Lys Lys Tyr Phe Pro Glu Ser Lys Thr Val Glu Arg Lys Met Glu 
165 170 175 

Ile Ser Tyr Phe His Gln Phe Leu Asp Glu Ser Leu Ser Glu Ala Leu 
180 185 190 

Asp His Phe His Gly Leu Leu Arg Lys Thr Pro Thr His Arg Tyr Ser 
195 200 205 

Glu Pro Val Gln Leu Asn Ile Phe Ile Asp Asp Leu Gln Leu Leu Ile 
210 215 220 

Glu Thr Ala Thr Arg Gly Lys Ile Lys Leu Lys Thr Pro Glu Glu Ala 
225 230 235 240 

Met Glu Leu Val Glu Asn Met Ala Ala Ser Asp Gln Ala Ile Leu His 
245 250 255 

Asp His Thr Tyr Val Pro Thr Lys Arg Ser Leu Leu Glu Leu Ser Thr 
260 265 270 

Gln Asp Ala Thr Leu Val Gln Asn Lys Leu Leu Thr Arg Gln Ile Glu 
275 280 285 

Ala Leu Ile Glu Thr Leu Ser Lys Leu Pro Gln Gln Leu Gln Ala Ile 
290 295 300 

Ser Ser Ser His Ser Ser Val Leu Gln Val Glu Glu Cys Pro Thr Cys 
305 310 315 320 

Arg Gly Thr His Glu Pro Gly Gln Cys Ala Ser Gln Gln Asp Pro Ser 
325 330 335 

Arg Glu Val Asn Tyr Ile Gly Ile Leu Asn Arg Tyr Gly Phe Gln Gly 
340 345 350 

Tyr Asn Gln Gly Asn Pro Ser Gly Phe Asn Gln Gly Ala Thr Arg Phe 
355 360 365 



Asn His Glu Pro Pro Gly Phe Asn Gln Gly Arg Asn Phe Met Gln Gly 
370 375 380 

Ser Ser Trp Thr Asn Lys Gly Asn Gln Tyr Lys Glu Gln Arg Asn Gln 
385 390 395 400 

Pro Pro Tyr Gln Pro Pro Tyr Gln His Pro Ser Gln Gly Pro Asn Gln 
405 410 415 

Gln Glu Lys Pro Thr Lys Ile Glu Glu Leu Leu Leu Gln Phe Ile Lys 
420 425 430 

Glu Thr Arg Ser His Gln Lys Ser Thr Asp Ala Ala Ile Arg Asn Leu 
435 440 445 

Glu Val Gln Met Gly Gln Leu Ala His Asp Lys Ala Glu Arg Pro Thr 
450 455 460 

Arg Thr Phe Gly Ala Asn Met Glu Arg Arg Thr Pro Arg Lys Asp Lys 
465 470 475 480 

Ala Val Leu Thr Arg Gly Gln Arg Arg Ala Gln Glu Glu Gly Lys Val 
485 490 495 

Glu Gly Glu Asp Trp Pro Glu Glu Gly Arg Thr Glu Lys Thr Glu Glu 
500 505 510 

Glu Glu Lys Val Ala Glu Glu Pro Lys Arg Thr Lys Ser Gln Arg Ala 
515 520 525 

Arg Glu Ala Lys Lys Glu Glu Pro Leu Ala Leu Pro Gln Asp Leu Pro 
530 535 540 

Tyr Pro Met Ala Pro Thr Lys Lys Asn Lys Glu Arg Tyr Phe Ala Arg 
545 550 555 560 

Phe Leu Glu Ile Phe Lys Gly Leu Glu Ile Thr Met Pro Phe Gly Glu 
565 570 575 

Ala Leu Gln Gln Met Pro Leu Tyr Ser Lys Phe Met Lys Asp Ile Leu 
580 585 590 

Thr Lys Lys Gly Lys Tyr Ile Asp Asn Glu Asn Ile Val Val Gly Gly 
595 600 605 
Asn Cys Ser Ala Ile Ile Gln Arg Ile Leu Pro Lys Lys Phe Lys Asp 
610 615 620 

Pro Gly Ser Val Thr Ile Pro Cys Thr Ile Gly Lys Glu Ala Val Asn 
625 630 635 640 

Lys Ala Leu Ile Asp Leu Gly Ala Ser Ile Asn Leu Met Pro Leu Ser 
645 650 655 

Met Cys Lys Arg Ile Gly Asn Leu Lys Ile Asp Pro Thr Lys Met Thr 
660 665 670 

Leu Gln Leu Ala Asp Arg Ser Ile Thr Arg Pro Tyr Gly Val Val Glu 
675 680 685 

Asp Val Leu Val Lys Val Arg His Phe Thr Phe Pro Val Asp Phe Val 
690 695 700 

Ile Met Asp Ile Glu Glu Asp Thr Glu Ile Pro Leu Ile Leu Gly Arg 
705 710 715 720 

Pro Phe Met Leu Thr Ala Asn Cys Val Val Asp Met Gly Lys Gly Asn 
725 730 735 

Leu Glu Leu Thr Ile Asp Asn Gln Lys Ile Thr Phe Asp Leu Ile Lys 
740 745 750 

Ala Met Lys Tyr Pro Gln Glu Gly Trp Lys Cys Phe Arg Ile Glu Glu 
755 760 765 

Ile Asp Glu Glu Asp Val Ser Phe Leu Glu Thr Pro Lys Thr Ser Leu 
770 775 780 

Glu Lys Ala Met Val Asn His Leu Asp Cys Leu Thr Ser Glu Glu Glu 
785 790 795 800 

Glu Asp Leu Lys Ala Cys Leu Glu Asn Leu Asp Gln Glu Asp Ser Ile 
805 810 815 

Pro Glu Gly Glu Ala Asn Phe Glu Glu Leu Glu Lys Glu Val Pro Ser 
820 825 830 

Glu Lys Pro Lys Ile Glu Leu Lys Ile Leu Pro Asp His Leu Lys Tyr 
835 840 845 

Val Phe Leu Glu Glu Asp Lys Pro Ile Val Ile Ser Asn Ala Leu Thr 
850 855 860 
Thr Glu Glu Glu Asn Arg Leu Val Asp Val Leu Lys Lys His Arg Glu 
865 870 875 880 

Ala Ile Gly Trp His Ile Ser Asp Leu Lys Glu Ile Ser Pro Ala Tyr 
885 890 895 

Cys Met His Arg Ile Met Met Glu Glu Asp Tyr Lys Pro Val Arg Gln 
900 905 910 

Pro Gln Arg Arg Leu Asn Pro Thr Met Lys Glu Glu Val Arg Lys Glu 
915 920 925 

Val Leu Lys Leu Leu Glu Ala Gly Leu Ile Tyr Pro Ile Ser Asp Ser 
930 935 940 

Ala Trp Val Ser Pro Val Gln Val Val Pro Lys Lys Gly Gly Met Thr 
945 950 955 960 

Val Val Arg Asp Glu Arg Asn Asp Leu Ile Pro Thr Arg Thr Val Thr 
965 970 975 

Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn Glu Ala Thr Arg 
980 985 990 

Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met Leu Glu Arg Leu 
995 1000 1005 

Ala Gly Gln Ala Tyr Tyr Cys Phe Leu Asp Gly Tyr Ser Gly Tyr Asn 
1010 1015 1020 

Gln Ile Ala Val Asp Pro Arg Asp Gln Glu Lys Thr Ala Phe Thr Cys 
1025 1030 1035 1040 

Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Pro Phe Gly Leu Cys Asn 
1045 1050 1055 

Ala Pro Ala Thr Phe Gln Arg Cys Met Leu Ala Ile Phe Ser Asp Met 
1060 1065 1070 

Val Glu Lys Ser Ile Glu Val Phe Met Asp Asp Phe Ser Val Phe Gly 
1075 1080 1085 

Pro Ser Phe Asp Ser Cys Leu Arg Asn Leu Glu Arg Val Leu Gln Arg 
1090 1095 1100 

Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys Cys His Phe Met 
1105 1110 1115 1120 

Val Arg Glu Gly Ile Val Leu Gly His Lys Ile Ser Ala Arg Gly Ile 
1125 1130 1135 



Glu Val Asp Arg Ala Lys Ile Asp Val Ile Glu Lys Leu Pro Pro Pro 
1140 1145 1150 

Leu Asn Val Lys Gly Val Arg Ser Phe Leu Gly His Ala Gly Phe Tyr 
1155 1160 1165 

Arg Arg Phe Ile Lys Asp Phe Ser Lys Ile Ala Arg Pro Leu Ser Asn 
1170 1175 1180 

Leu Leu Asn Lys Asp Val Ala Phe Val Phe Asp Glu Glu Cys Leu Ala 
1185 1190 1195 1200 

Ala Phe Gln Ser Leu Lys Asn Lys Leu Val Thr Ala Pro Val Met Ile 
1205 1210 1215 

Ala Pro Asp Trp Asn Lys Asp Phe Glu Leu Met Cys Asp Ala Ser Asp 
1220 1225 1230 

Tyr Ala Val Gly Ala Val Leu Gly Gln Arg Lys Asp Lys Val Phe His 
1235 1240 1245 

Ala Ile Tyr Tyr Ala Ser Lys Val Leu Asn Glu Ala Gln Leu Asn Tyr 
1250 1255 1260 

Ala Thr Thr Glu Lys Glu Met Leu Ala Ile Val Phe Ala Leu Glu Lys 
1265 1270 1275 1280 

Phe Arg Ser Tyr Leu Ile Gly Ser Arg Val Ile Ile Tyr Thr Asp His 
1285 1290 1295 

Ala Ala Ile Lys His Leu Leu Ala Lys Thr Asp Ser Lys Pro Arg Leu 
1300 1305 1310 

Ile Arg Trp Val Leu Leu Leu Gln Glu Phe Asp Ile Ile Ile Lys Asp 
1315 1320 1325 

Lys Lys Gly Ser Glu Asn Val Val Ala Asn His Leu Ser Arg Leu Lys 
1330 1335 1340 

Asn Glu Glu Val Thr Lys Glu Glu Pro Glu Val Lys Gly Glu Phe Pro 
1345 1350 1355 1360 

Asp Glu Phe Leu Leu Gln Val Thr Glu Arg Pro Trp Phe Ala Asp Met 
1365 1370 1375 

Ala Asn Tyr Lys Ala Thr Gly Val Ile Pro Glu Glu Phe Asn Trp Ser 
1380 1385 1390 



Gln Arg Lys Lys Phe Leu His Asp Ala Arg Phe Tyr Val Trp Asp Asp 
1395 1400 1405 

Pro His Leu Phe Lys Ala Gly Ala Asp Asn Leu Leu Arg Arg Cys Val 
1410 1415 1420 

Thr Lys Glu Glu Ala Arg Ser Ile Leu Trp His Cys His Ser Ser Pro 
1425 1430 1435 1440 

Tyr Gly Gly His His Ser Gly Asp Arg Thr Ala Ala Lys Val Leu Gln 
1445 1450 1455 

Ser Gly Phe Phe Trp Pro Ser Ile Phe Lys Asp Ala His Glu Phe Val 
1460 1465 1470 



Arg Cys Cys Asp Lys Cys Gln Arg Thr Gly Gly Ile Ser Arg Arg Asn 
1475 1480 1485 

Glu Met Pro Leu Gln Asn Ile Met Glu Val Glu Ile Phe Asp Cys Trp 
1490 1495 1500 

Gly Ile Asp Phe Met Gly Pro Phe Pro Ser Ser Tyr Gly Asn Val Tyr 
1505 1510 1515 1520 

Ile Leu Val Ala Val Asp Tyr Val Ser Lys Trp Val Glu Ala Ile Ala 
1525 1530 1535 



Thr Pro Lys Asp Asp Ala Arg Val Val Ile Lys Phe Leu Lys Lys Asn 
1540 1545 1550 

Ile Phe Ser Arg Phe Gly Val Pro Arg Ala Leu Ile Ser Asp Arg Gly 
1555 1560 1565 

Thr His Phe Cys Asn Asn Gln Leu Lys Lys Val Leu Glu His Tyr Asn 
1570 1575 1580 

Val Arg His Lys Val Ala Thr Pro Tyr His Pro Gln Thr Asn Gly Gln 
1585 1590 1595 1600 

Ala Glu Ile Ser Asn Arg Glu Leu Lys Arg Ile Leu Glu Lys Thr Val 
1605 1610 1615 

Ala Ser Thr Arg Lys Asp Trp Ser Leu Lys Leu Asp Asp Ala Leu Trp 
1620 1625 1630 

Ala Tyr Arg Thr Ala Phe Lys Thr Pro Ile Gly Leu Ser Pro Phe Gln 
1635 1640 1645 



Leu Val Tyr Gly Lys Ala Cys His Leu Pro Val Glu Leu Glu Tyr Lys 
1650 1655 1660 

Ala Tyr Trp Ala Leu Lys Leu Leu Asn Phe Asp Asn Asn Ala Cys Gly 
1665 1670 1675 1680 

Glu Lys Arg Lys Leu Gln Leu Leu Glu Leu Glu Glu Met Arg Leu Asn 
1685 1690 1695 

Ala Tyr Glu Ser Ser Lys Ile Tyr Lys Glu Lys Met Lys Ala Tyr His 
1700 1705 1710 

Asp Lys Lys Leu Leu Arg Lys Glu Phe Gln Pro Gly Gln Gln Val Leu 
1715 1720 1725 

Leu Phe Asn Ser Arg Leu Arg Leu Phe Pro Gly Lys Leu Lys Ser Lys 
1730 1735 1740 

Trp Ser Gly Pro Phe Ile Ile Lys Glu Val Arg Pro Tyr Gly Ala Val 
1745 1750 1755 1760 

Glu Leu Val Asp Pro Arg Glu Glu Asp Phe Glu Lys Lys Trp Ile Val 
1765 1770 1775 

Asn Gly Gln Arg Leu Lys Pro Tyr Asn Gly Gly Gln Leu Glu Arg Leu 
1780 1785 1790 

Thr Thr Ile Ile Tyr Leu Asn Asp Pro Glx 
1795 1800 



<210> 19 

<211> 9829 

<212> DNA 

<213> Glycine max 

<400> 19 

tgataactgc taaataattg tgaattaata gtagaaaatt agtcaaattt tggcttaaaa 60 
ttaattattt agcagttatt tgtgattaaa agttagaaaa gcaattaagt tgaatttttg 120 
gccatagata tgaaaactga aggtacaaca agcaaaaggc agcagaaagt gaagaaaaag 180 
aataaaatct gaagcagacc cagcccaaca cgcgccctta gcgcgcgtca cgcgctaagc 240 
ttgcaaggca gcacaggcac taagcgaggc gttaagcacg aagatgcagg attcgttacg 300 
tgcgctaagc gcgaggcaca cgctaagcgc gcgatccaac agaagcacac gctaagcctg 360 
cagcatgcgc taagcgcgcc tacgaaggcc caaagcccat ttctacacct ataaatagag 420 
atccaagcca agggagaatg tacaccttgc ctcagagcac ttctctcagc attccaagct 480 
tgagctctcc cttttctctc tatattcttt gcttttatta tccattcttt ctttcacccc 540 
agttgtaaag cccctcaatg gccatgagtg gttaatcccc tagctacggc ctggtaggcc 600 
taaaaagcca atgatgtatg gtgtacttca agagttatca atgcaaagag gattcattcc 660 
aggttttatg ttctaattct ttccttttta tcttgcattt atgtcttaaa tttctgttgg 720 
gttttattcg ctcgggagag ggtatttcct aataagggtt taagaagtaa tgcatgcatc 780 
agttttaggg gttatacgct tggtaaaggg taacacctaa tagaacaaat taagaaaagg 840 
atcgtcgggc tagcattgct aggcatagaa tgatggccca atgcccatgc atttagcaac 900 
atctagaatt taaccttaat gcattttaat tattgaatct tcacaaaggc atttgggaga 960 
taggtagtta aaataggctt gtcatcgtga ggcatcaagg gcaagtaaaa ttaatagatg 1020 
tgggtagaac taattcaact gcattggtaa tgaacatcat aaattcattc atcgtaggcc 1080 
aattaggttt gtccggtctt ggcattttca tcaattgtct tcctaaatta tttgatctaa 1140 
tagcaacaat ttattcttat gcctattcct gtttttacta tttactttta cttacaaatt 1200 
gaagagtatt caataaagtg caataaaatc cctatggaaa cgatactcgg acttccgaga 1260 
attactactt agaacgattt ggtacacttg tcaaacacct caacaagttt ttggcgccgt 1320 
tgtcggggat tttgttctcg cacttaattg ccatactata ttagtttgta agcttaattc 1380 
ttcttttctt ggctcattct tttattattc tttactttac tttttcttct atcctttctt 1440 
tcttctccca taaattgcac gggtagtgcc tttttgtttt tatacgaggt agaactgcat 1500 
ctggagacgt tgttcctatt aacttagaaa ttgaagctac gtgtcggcgt aacaacgctg 1560 
caagaagaag aagggagcaa gacatagaag gaagtagtta cacctcacct cctccttctc 1620 
caaattatgc tcagatggac ggggaaccgg cacaaagagt cacactagag gacttctcta 1680 
ataccaccac tcctcagttc tttacaagta tcacaaggcc ggaagtccaa gcagatctcc 1740 
tactcaaggg aacctcttcc atggtcttcc aaatgaagat ccatatgcgc atctagcctc 1800 
atacatagag atatgcagca ccgttaaaat cgccggagtt ccaaaagatg cgatactcct 1860 
taacctcttt tccttttccc tagcaggaga ggcaaaaaga tggttgcact cctttaaagg 1920 
caatagctta agaacatggg aagaagtagt ggaaaaattc ttaaagaagt atttcccaga 1980 
gtcaaagacc gtcgaacgaa agatggagat ttcttatttc catcaatttc tggatgaatc 2040 
ccttagcgaa gcactagacc atttccacgg attgctaaga aaaacaccaa cacacagata 2100 
cagcgagcca gtacaactaa acatattcat cgatgacttg caaccttaat cgaaacagct 2160 
actagaggga agatcaagct gaagactccc gaagaagcga tggagctcgt cgagaacatg 2220 
gcggctagcg atcaagcaat ccttcatgat cacacttatg ttcccacaaa aagaagcctc 2280 
ttggagctta gcacgcagga cgcaactttg gtacaaaaca agctgttgac gaggcagata 2340 
gaagccctca tcgaaaccct cagcaagctg cctcaacaat tacaagcgat aagttcttcc 2400 
cactcttctg ttttgcaggt agaagaatgc cccacatgca gagggacaca tgagcctgga 2460 
caatgtgcaa gccaacaaga cccctctcgt gaagtaaatt atataggcat actaaatcgt 2520 
tacggatttc agggctacaa ccagggaaat ccatctggat tcaatcaagg ggcaacaaga 2580 
tttaatcacg agccaccggg gtttaatcaa ggaagaaact tcatgcaagg ctcaagttgg 2640 
acgaataaag gaaatcaata taaggagcaa aggaaccaac caccatacca gccaccatac 2700 
cagcacccta gccaaggtcc gaatcagcaa gaaaagccca ccaaaataga ggaactgctg 2760 
ctgcaattca tcaaggagac aagatcacat caaaagagca cggatgcagc cattcggaat 2820 
ctagaagttc aaatgggcca actggcgcat gacaaagccg aacggcccac tagaactttc 2880 
ggtgctaaca tggagaagaa ccccaaggaa gaatgaaaag cagtactgac ttgagggcag 2940 
agaagagcgc aggaggaggg taaggttgaa ggagaagact ggccagaaga aggaaggaca 3000 
gagaagacag aagaagaaga gaaggtggca tcaccaccta agaccaagag ccagagagca 3060 
agggaagcca agaaggaaga accactagcc cttccacagg atctcccata tcttatggca 3120 
cccaccaaga agaacaagga gcgttacttt agacgtttct tggaaatatt caaagggtta 3180 
gaaatcacta tgccattcgg ggaagcctta cagcagatgc ccctctactc caaatttatg 3240 
aaagacatcc tcaccaagaa ggggaagtat attgacaacg agaatattgt ggtaggaggc 3300 
aattgcagtg cgataataca aaggaagcta cccaagaagt ttaaagaccc cggaagtgtt 3360 
accatcccgt gcaccattgg gaaggaagcc gtaaacaagg ccctcattga tctaagagca 3420 
agtatcaatc tgatgccctt gtcaatgtgc aaaagaattg ggaatttgaa gatagatccc 3480 
accaagatga cgcttcaact ggcagaccgc tcaatcacaa ggccatatgg ggtggtagaa 3540 
gatgtcctgg tcaaggtacg ccacttcact tttccggtgg acttttttat catggatatc 3600 
gaagaagaca ctgagattcc ccttatctta ggcagaccct tcatgctgac tgccaactgt 3660 
gtggtggata tggggaatgg gaacttagag ttgactattg ataatcagaa gatcaccttt 3720 
gaccttatca aggcaatgaa gtacccacag gagggttgga agtgcttcag aatagaggag 3780 
attgatgagg aagatgtcag ttttctcgag acaccataga cttcgctaga aaaagcaatg 3840 
gtaaatgctt tagactgtct aaccagtgaa gaggaagaag atctgaaggc ttgcttggaa 3900 
aacttggatc aagaagacag tattcctgag ggagaagcca atttcgagac gctagagaag 3960 
gaagttccgt ctgagaagaa gaagatagag ttgaagatat tgcctaatca tttgaagtat 4020 
gtgttcttgg aggaagataa gcctatagtg atcagtaatg cactcacaac agaggaagaa 4080 
aataggttgg tagacgtcct aaagaaacac agggaagcaa ttggatggca catatcggat 4140 
ctcaggaatt agccctgcct actgcatgca catgataatg atggaagagg actacaagcc 4200 
agtccgacaa ccctagaggc ggctgaatcc aacaatgaag gaagaggtaa gaaaggaggt 4260 
gctcaagctt ttggaggctg ggttcatata ccccatctct gatagcgctt gggtaagtcc 4320 
agtacaggtg gttcctaaga aaggcggaat gacagtggta cgaaatgaga ggaatgactt 4380 
gataccaaca cgaactgcca ctggttggtg gatgtgtatc gactatcgca agttgaatga 4440 
agccacacag aaggaccatt tccccttacc tttcatggat tagatgctgg aaaggcttgc 4500 
agggcaggca tactactgct tttggatgga tattcaggat acaaccagat cgcggtagac 4560 
cccagagatc aggagaagac ggcctttaca tgccccttcg gcgtctttgc ttacagaagg 4620 
atgtcattcg ggttatgtaa cgcactagcc atatttcaga ggtgcatgct agccattttt 4680 
tcagacatgg tggagaagag catcgaggta tttatggacg acttctggat ttttggaccc 4740 
tcatttgaca actatttgag gaacctagag atggtactac agaggtgcgt atagactaac 4800 
ttggtactaa attgggaaaa gtgtcatttc atggttcgag agggcatagt cctgagccac 4860 
aagatctcag ccagagggat tgaggttgat cagacaaaga tagacgtcat tgagaagttg 4920 
ccgccaccaa tgaatgttaa aggtgtcaga agtttcttag ggcatgcagg tttctacagg 4980 
aggtccatca aggacttctc gaagattgcc aggcccttaa gcaatctgtt gaataaggat 5040 
gtggctttta agtttgatga agaatgttca gcagcatttt tagacactaa agaataagct 5100 
caccactgca ccagtaatga ttgcaccaga ctggaataaa gattttgaac taatgtgtga 5160 
tgccagtgat tatgcagtag gagcagtttt gggacagagg cacgacaagg tatttcacgc 5220 
catctattat gctagtaagg tccttaataa agcataacta aattatgcga ccacagaaaa 5280 
gcagatgcta gccattgtct tttccttgga gaagttcagg tcgtacttga tagggtcgag 5340 
ggtcaccatt ttcacaaatc atgctgccat caagcacttg ctcgccaaaa cagactcaaa 5400 
gctgaggttg attagatggg tcctgctgat acaagaattt gacatcatca tcaaggacaa 5460 
taaaggatcc aagaatgtgg tagccaatca tttatcctga ttaaagaatg aagaagtcac 5520 
caaggaagaa ccagaggtaa aaggagaatt tcctgatgaa tttcttttgt aggttaccac 5580 
cagaccttgg tttgcagaga tggctaacta caaagccaca ggagtcattc cagaggagtt 5640 
taattggagt cagaggaaga aattcttgca tgatgcacgc ttctatgtgt gggataatcc 5700 
tcatttgttt agggcaggag ctgataatct attaaggaga tgcgtcacaa aggaggaagc 5760 
acagagcatt ctttggcact gccacagttc accctatggc ggacaccaca gtggggacag 5820 
aacagcagca aaagtgctac aatcaggttt tttctggcct tctattttta aagatgctta 5880 
cgagtttgtg cgttgttgtg ataaatgcca gagaacaggg gggatatctc gaaggatgga 5940 
gatgcctttg cagaatatca tggaagtaga gatctttgac tgttggggca tagacttcat 6000 
ggggcctctt ccttcttcat acgagaatgt ttacatcctg gtagctgtgg attacgtctc 6060 
caaatgggtg gaggccatag ccattccaaa agacgatgcc agggtagtga taaaatttct 6120 
gaagaagaac atcttttccc attttggagt cccatgagcc ttgattagtg atggggaacg 6180 
cacttctgca ataatcagtt gaagaaagtc ctggagcact ataatgtaag acataaggtg 6240 
gccacacctt atcaccctca gacaaatggc caagtagaaa tttctaacaa agagctcaag 6300 
cgaatcctgg agaagacagt tgcatcatca agaaagaatt gggccttgaa gctcgatgat 6360 
actctttggg cctacagggc agcattcaaa actcccatcg gcttatcacc gtttcagcta 6420 
gtgtatggga aggcatgtca tttaccagtg gagctggagc acaaagcata ttaggctctc 6480 
gagttactca actttgataa caacgcatgc ggagaaaaga ggaagctaca gttgctggaa 6540 
ttagaagaga tgagactgaa tgcctacgag tcatccaaaa tttacaacca aaagatgaag 6600 
gcatatcatg acaagaagct acagaggaaa gaattccaac catggcagca ggtattactc 6660 
tttaaatcaa ggctaaggct attcccaggt aagctgaagt ccaagtggtt agggccgttc 6720 
ataatcaatg aagtcagacc tcacggagca gtagaattgg gggaccctag agaagagaac 6780 
tttgagaaga aatggatcgt caatggacaa cgcttaaagc tttataacga aggacaacta 6840 
gagcgattga cgaccatcat ctacttgaat gacccttgag gaggcctagt gtctagctaa 6900 
agacaataaa ctaagcgctg gttgggaggc aacccaacat attttgtaaa aatgtagtca 6960 
tttttctgta ttccttcaaa aaaaaaggga aaagcccaat aggtgcaaat agaaaacagc 7020 
aggtgcagaa agtaaagacc cagtaggtga agtcagcaat aggaggggtg ccaatagaag 7080 
aagcgaagtg ggctgcacga agccacgcgc atctaggcgc taagcgccta ggtatatttt 7140 
caatttttaa attttaaaaa ttctgaggga aaccaaggga cgcttccctt ggtatgctta 7200 
gcgaccagat gcgcgctaag cgcgcgaacc ataaattgct ggacagtttt caaaactgtc 7260 
ccacccctca gctgcccttt tgtattttaa atttcaacca cctcattttt ttttctcttc 7320 
tgcgcactcc cactccctat accctttttc tctacatttc ctctaaactt actcgcctcc 7380 
ctgtgcctct tcacgtagtt tttacgaaaa taggtgagat tgggaatctg gactgttgct 7440 
gtaatacttt gcaggtacca tcacgctaag ccctacacaa aggcttagcg agaaaaagaa 7500 
acatagaaag gaagaaagaa gcatgcgcta agcctgcgcc agacaggaca agaaaacaca 7560 
gcatgcgttt agccggcacc tcgtgctaag cgcgctcatg agactcagtg aacgcgctaa 7620 
gcatggggct gggccttagg gcccatcagc cctcgtgcct tactttctgc accctctttt 7680 
tcactaacta cactcccttc tgaatttctt tttgcaccct cctctattac taaccacaat 7740 
ctatttttcc gtctttgttt ctttgttttt tcagatggcc tcccgcaaac gccgagctgt 7800 
gcccacacct ggggaagcat caagctggga ctcttcccgc ttcacctcgg agatcatttg 7860 
gcatagatac caggataaca ttcagctccg gaacattctt ctggagagga atgtcgagct 7920 
cacacccagg atgtttgatg agttcctcca ggagctccag aggtgcagat gggaccaggt 7980 
gttaacccga cttccagaga agaggattga tgtcgctctg gtgaaggagt tttactccaa 8040 
cttatatgat ccagaggacc atagtccaaa gttttgtagg gttcaaggac aggtcatgtg 8100 
gtttgatgca gagacgatta acgacttcct tgacacccca gtcatcctgg cagatgtaga 8160 
ggagtaccca gcctactctc agtacctccg cactcctccc gatcatgatg ccatcctctc 8220 
cactttgtgt actccagggg gacggtttgt tctgaatgtt gatggtgccc cctagaagtt 8280 
gctgcggaag gatctgacga cactcgctca gacatagagt gtcctttctt attttaacct 8340 
tgttcttact tctcacactt ctgatattaa tgttgacagg gcccgtctca tatatggctt 8400 
ggtgatgaag atggacctgg acgtggacag ttttatttcc cagcaaatca gtcagatcgc 8460 
ccaatccaac acatccaggc tcgggttccc agcgttgatc acggcactgt gtgacattca 8520 
gggggttgtt tctaacaccc tgatttttga gttactcaat cctatgatta accttgcgta 8580 
cattacacta ctaaaaaaaa gctattttac gacgcgcgtt ccacatcgtt tctgccaaaa 8640 
atgtcgtaat aggagtagcg gtggcaattc cgtaaataag tgagcatttt atgtgccatg 8700 
tgcatggcgc gtgacacatt caacgacgtt ggccatgggt gcccgtcttt gtaggtggcg 8760 
cgctggtaac ttaagacggt gcacttaaaa acatcgtcgt tgaaattttg aatttcgaag 8820 
acgttgctct taagccaccg tcgttaaggt tgatgtatat aatgttgtaa tttgcgctat 8880 
ttcgtgaaca ctcgctcgag ctcccgcttc cctgtgtgtc tgaaatttct gtgtactgtg 8940 
acctcgccat gacttgtggc gtttgcccac acccccgtca cctcgtccgg catctcgtct 9000 
tgtggtggca ccgccgaagc cagtgagtac ccctttttgg aggggtcgta acacggctgt 9060 
gttttgaagg taaggttgtg cgaagatttg atgctccata gttgttactt gctctgagtt 9120 
tttcttttag tgatgtatct tttacccctc tttcagtgct tcttccctca gaatttgatt 9180 
gccggtatta gaaccccact attcatcagg tccaaacaag cttaaatcat ggtaaatgta 9240 
cttcttgaca aatccaacat ttgcaaggtg gtttgacata tgagaaatag ctttaaccta 9300 
atgttcttaa atttattatg aagctctcta gcgattacga aaatctctca atatcttctc 9360 
tctctgtctc acatgcatca ctgtaagata ggtgtcaaaa agaaaggatt gaagttaaat 9420 
ttaaacctaa tgttttgaaa tgaaggaaaa aaagaaagag attaatgacg ctagggaact 9480 
tgaatgaaga aagagaaagg aacataatta gtcctttgaa ctgattgggg tggggagtgt 9540 
ggcacgaaac ataatttcta gttctatgga tttattcgtg acactgtggt aggaccaagc 9600 
aaactctgcc cccagagtgc gcagtgtctt gcagtctgag aggttctttt gttgggctag 9660 
tttgaggaat tcttcattgc agggttgagc acggtggcca atggccaagg agagaaaaga 9720 
cagtactgtc aaaatggtta atggtaagat gagtgaagat gacatgtttt tttgttgtct 9780 
ctttgtgtgt ttccttttgg tgggaaaatg tgatgcatag agagatcga 9829 



<210> 20 

<211> 12571 

<212> DNA 

<213> Glycine max 

<400> 20 

gatcttaaat tcttaaactt tgataacagt gcatacggag agaagagaaa gttgcagtta 60 
ctggaactcg aagaaatgag gttgaacgct tacgaatcat ctaggattta caagcagaag 120 
gtaaaggcgt atcatgataa gaaattacaa aagaaagaat tccagccagg gcagcaagta 180 
ctactcttca actccaggtt gagattattc acaggaaagc tgaagtcaaa gtggtcagga 240 
tcgttcatta ttaaggaaat cagacctcac ggagcggtag aattggtgga ccctcgagaa 300 
gaaaattatg agaagaaatg gatcgtcaac ggacaacgct taaaaattta caatggagga 360 
caactagaga agttgacgac catcatgcat ttaaaagatt cttgaaagaa gccctatgtc 420 
tagctaaaga cattaaacta agcgctggtt gggaggcaac ccaacatact tatgtaaggt 480 
atttataagt atttatattc tgtctttatt atattttgca gttgttattt caggttaaaa 540 
gaaaaaacag gggccctccg gactcgcacc agagtatcaa cgtccatatc tgaggcaccc 600 
cctacttctc agccttccgc tccatcacct actgatcttc atgctcagat gttgcggtct 660 
attcacacag gacaggagac ccttatggag aacatgcaca agctgtcctt tcatctacat 720 
atggatccac cactgatcac tccataggtc tatcgtcagc gggtcgtctg gccatgagac 780 
cagctctcca ctgacagggg ggaagagccc tctggagatg ctgcagttga tgaagacctc 840 
atagcagact tggctagtgc tgattggggt ccatgggcag atttgggagg cggcacagga 900 
cactggtttt atttttcttg atgtttttgt ttatgtttaa tgtttatgtt ttatgtcttt 960 
atgttttatt tggtttctag ttattatggt cttaattgta gttttatgtt caaaatgaaa 1020 
agcagtggta ataatattag atttgagcat atgcgtgaat aaataaattg catgataact 1080 
tgagaaatga caattttgag tttgttctaa aaggtccaac actggaaagg ctactagtca 1140 
ttggaaagca ctggtcttgg aagcaaaagt caaatcaagg aatgaaacat gattcacgga 1200 
aaaggaaagg ttagcttgat ggaatgaaga cacatctggt acgccaatac tgaattaatc 1260 
ccggtgagag tgtgacctta attgtgagag aaaacgcctg tttttaagct cttagttttg 1320 
catcattctt ggactgttaa aattagttac ttaaggtgga tatgatcaag gccatgtttg 1380 
ttttatttta cccactcagc caaaaagcca acccaacata attttatccc ttgcacccat 1440 
attgagccaa aaagaattat aatgatttat ttgagtaaac ccctgagcca agaaattgat 1500 
attcctaacc ttgtgtagga ttctaagaga gcagtagggt tccaaatgct tataaggcct 1560 
tattttgggg gattttgaac aaatgggtaa agtagccaag gtaataacac acattagaac 1620 
acctctaaat aattgtgagc ccattactat tattattatt attattatta ttattattat 1680 
tattattatt attattatta ttattattat tattggttat aaaaaaaaga agaaaaaaag 1740 
agaaagaata agaagagaaa gggcaaagaa aaaaaatgaa aaagagaggt ttcagtggaa 1800 
agtgctgaag gcaaaaaagg ctaagtggga aataggtctt ggcaagacct taaatttttg 1860 
gaatgtatgc tctcttataa ccttatattt tgaatttcca agaaaaacca tgattctttg 1920 
ttagccaggc cccattacaa ggcatgaaag tccttagtga cccaccgaag gtaattaagg 1980 
ctaaccttaa ccaagatgaa gtacaaaact cttgagtttt atttacaggt tgttaaaatt 2040 
gcaaacactt gaccaggcac ttgtgagtag agagaaacac cagttttgta aggaagtaag 2100 
gcaagccgga cctgttggaa ttccatataa ttgacttgtt tctgctcttg tgtttatgct 2160 
tttatttcaa gatcatgaca gatgcaaaga gaccagccaa aggatcaagg aattgaagtc 2220 
atggagagtg ttggaatgat tggaacttgc ttgagaaaat ttttgcttaa gaatggaata 2280 
attttattct ttttatttgc ttggggacaa gcaaagttta atttggggga ttttgataac 2340 
tgctaaataa tagtgaatta atagtggaaa attggtctga aattaactta gaattaatta 2400 
tttagtagtt atttatgctt taatttggaa agatttaatt aattttgaat tctgattgca 2460 
gatgtgaaaa agggaggtac aacaagcaaa aaggagcaaa aataaagaaa aagaagaaga 2520 
aaatcagacg aagacccaag cccaaatttt cacctataaa taagaaggtc agcctagcaa 2580 
aacacacaca ctttcagaga gctcagtttt cagacttctg gcactcagtt ctctccttct 2640 
ccttcccttt ttcttatatt cttattacct ttctttcacc cccttctcat tgtaaagccc 2700 
tcttgactat gagtggctaa acccctagct agggcctggc aggcctaaaa agccaatgat 2760 
gtatggagca tttcaagagt tatcaataaa gagaggattt ccttccaggt tctttattta 2820 
ccgttctttc ttatttatcc tgtatttcgg accttatttt ctgttagggt ttagtccact 2880 
cgggagaggg taaagcctaa ttaggggtaa ggaatgaata cttgaatcta ttttaagggt 2940
tagtccattc gggagagggt aaagcttaat agaacaataa aaggaagaaa ttatcgggtt 3000
atcattagag ggttttcctt ccaggttctt ttatctgctt ttctttctta ttctgcatct 3060 
cagtctttat tttctgttag tctttagtcc actcgggaga gggtaaagcc taattaaggg 3120
taaggaatga ttgcgtgaat ctgttttaag ggttagttca ctcaggagag ggtaacgctt 3180 
aatagaacaa taaaagaaaa aaatcacagg gttagcattg acccgatgcc catactttag 3240
caaacatata gaatttaatc ttaatgcatc ttagttattg agtctttgca aagggcattt 3300 
ggaagatagg taattaaggt aggcttgtca tcatgaggca tcaggggcaa gtagatggat 3360 
agatgtgggg cagaatcagt tcactggtat tgataacaga caaatcttga atccatatat 3420 
ctaggctgat tagacttttt aggttttagc aattttatta tatagatttt attccctatt 3480 
ttattgtttg aagtttctta ttctattgtt gggttttctt agaagtagct attccttatt 3540 
ttactgttgg gttttcttag aaatagttat tccttattgt tgggtttctt agaagtagtt 3600 
attccttatt ttactgttgg gttttattag gagtacttat cccctgttta ggagtaggta 3660 
tttaggctta ttagatttag taatatttta tagactttat tctttattta ttgcttgagt 3720 
ttcctttaat ttagaagtag ctgcttagat ttaaattact ttatctttat cctttaatct 3780 
tatctttaaa tcttttatct tttccttatc ttatctttta tctttcttta tcttttattt 3840 
caaatttctt atcccttgct agatttaaat tgcatttaat tttatacact aaatttacaa 3900
tttgcaaact aaaaagtact tcacataagt gcaacaaaat ccctatggta cgatactcga 3960
cttaccgaga gattattact acgagcgatt tggtacactt gccaaagagc taacaaagat 4020
attgcctgat catctaaagt atgtgttctt ggaggaagat aaacctatag taatcagtaa 4080 
cgcactcaca acaaaggagg aaaataggtt ggttgatgtc ctcaagaaat acagggaagc 4140
aattggatgg catatatcgg atctcaagga aattagccct gcttactaca tgcacagaat 4200
aatgatggaa gagaactaca agccagtccg acaaccccag aggcggctga atccaacaat 4260 
gaaggaagag gtaagaaagg aggtactcaa gctcttggag gctgggctca tatacccctt 4320 
ctctaacagt gcttgggtaa gcccagtaca ggtggttccc aagaaaggtg aaatgacagt 4380 
ggtacgaaat gagaagaatg acttgatacc cagacgaact atcactggtt ggcgaatgtg 4440 
tatcaactat cgcaagctga atgaagccac acgaaaggac catttcccct tacttttcat 4500
ggatcagatg ctagagagac ttgtagggca ggcatactac tatttcttgg atggatactc 4560
gggatataat cagatcgcgg tggaccccag agatcaagag aaggcggcct ttacatgccc 4620 
ttttggcgtt tttgcttata gaaggatgcc attcgggtta tgtaatgcac cagccacatt 4680 
tcagaggttc atgctggcca ttttttcaga catggtgtag aaaagcattg aggtatttat 4740
ggacgacttc tgggtttttg gaccctcatt taacagtttg aggaacctag agatggtact 4800 
ttagagttga gtagagacta acttggtact gaactgggag aagtgtcact tcatggttca 4860 
agagggcatc gtcctaggcc acaagatctc agcaagaggg attgaggtcg atcgggcaaa 4920 
gatagacgtc atcgagaagc tgccaccacc actgaatgtt aaaggggtta gaagtttctt 4980
agggcatgca ggtttctaca agaggtttat caaggacttc tcaaagattg ccaggcccct 5040 
aagtaacctg ttgaataaag acatggtttt caagtttgat gaagaatgtt caacagcatt 5100
ccaatcattg aagaataagc ttaccactgc acctgtaatg attgcacccg actggaataa 5160 
agattttgaa ctaatgtgtg atgccaatga ttatgcagta ggagcagttc tgggatagag 5220
gcacgacaag gtatttcacg ccatctatta tgctagcaag gtcctgaatg aagcatagtt 5280
gaattatgca accatagaaa aggagatgct agccattgtc tttgccttgg agaaattcaa 5340 
gtcatacttg atagggttga gggtcaccat tttcacagat catgctgcca tcaagcacct 5400 
gcttgccata acagactcaa aaccgaggtt gattagatgg gtcctactgt tacaagaatt 5460 
tgacatcatc atcaaggaca agaaaggatc cgagaatgtg gtagccaatc atctatctcg 5520 
attgaagaat gaagaagtca ccaaggaaga accagaggta aaaggtgaat ttcctgatga 5580 
gtttcttttg caggttaccg ctagatcttg gtttgcagac atggccaatt acaaagccac 5640 
gggagtcatt ccagaggagc ttaattggag tcaaaggaag aaattcttgc acaatgcacg 5700
cttctatgtg tgggatgatc ctcatctgtt caaggcagga gcagataatt tactaaggag 5760
atgcgtcaca aaggaggaag cacggagcat tctttggcac tgccacagtt caccctatgg 5820
cggtcaccac agtggggaca gaacagcagc aaaagtgcta caatcaggtt ttttctggcc 5880
ctctattttt aaagatgctc acgagtttgt gcgttgttgt gataaatgcc aaagaacagg 5940 
ggggatatct cgaagaaatg agatgccttt gcaaaatatc atggaagtag agatctttga 6000 
ctgttggggc atagacttca tcgggcccct gccttcgtta tatggaaatg tctacatctt 6060 
ggtagttgtg gattacgtct ccaaatgggt ggaagtcata gctacgccaa aggatgatgc 6120
caaggtagta atcaaatttc tgaagaagaa cattttttcc cgttttggag tcccacgagc 6180 
cttgattagt gataggggaa cgcacttctg caacaatcag ttgaagaaag tcttggagca 6240
ctataatgtc cgacataagg tggccacacc ttatcatcct cagacaaatg gccaagcaga 6300 
aatctctaac agggagctca aggcgaatct tggaaaagac aattgcatca tcaagaaagg 6360 
attgggcctt gaagctcgat gatactctct tggcctatag ggcagcgttc aagactctca 6420 
tcggcttatc gccatttcag ctagtgtatg ggaaggcatg ccatttacca gtggagctag 6480 
agcacaaagc atattgggct ctcaagttgc tcaacttcga caacaacgca tgcggggaaa 6540 
agaggaagct acagatgttg gaattagaag agatgagact gaatgcctac gagtcatcca 6600 
gaatttacaa gcaaaagatg aaggcatatc atgataaaaa gctacagagg aaagaattcc 6660 
atccagggaa gcaggtatta ctctttaact cgaggctaag gctattccca ggtaagctga 6720
agtccaagtg gtcaaggcca tttatcataa aagaagtcag acctcatgga gcagtagaat 6780 
tggtggaccc ttgagaagag aactttaaga agaaatggat cgtcaatcga cagcgcttga 6840
agccctacaa cggaggacaa ctcgagcgat tgacgaccat catctactta aatgatcctt 6900 
gagaaggcct actgtctagc taaagacaat aaactaagca ctggttggga ggcaacccaa 6960 
catatttttg taaaaatgta gttattttta ttttatgtaa aaaaaaacaa gagggcccaa 7020
taggtgcaaa tagcaaacag gaggtgcaaa aagcaaaggc ccaacaggtg aagacaacaa 7080 
taggaagggt gccaatagca aaactgaagt gggctgcatg aagccgcgcg ctaagcgccc 7140 
aggtatgttt ttaaaatctg atgggcaacc aagggacgct ttccttggtg cgcttagcgg 7200
ccacatgcgc gctaagcgcg taagtcataa attactggac agttttcgaa actgcccaac 7260 
ccctcagctg cctcctccgc gttattaaat tacaaccatt tcatttcatt atccttcttt 7320 
tctttcgcaa atctaccctt ctttgcacct ctgctactgt aacccctgaa ttcttggtct 7380
tttcacacaa aacaatcact aacgaaggta aagaattgct ttgtatggat gttgttatga 7440
atgcacaggt aacagcacgc taagccctgc tcgacgctta gccaatgaag acggattgaa 7500 
ggccataacg acgagctcgt taagcgtgac gaagcacgct aagcaggcgc ctgacaggac 7560 
gagaaagcaa agcgcgcgct tagccggcac ttccgcgcta agcgcgctca tgaacatcac 7620 
tgaacgcgct aaacgtgtgc cagaggcgct aaacgcgtgc cagaggcgct aaacgcgtgc 7680
attagtcaca gcaggatggt gctaagcgcg gggttgggcc tcagggccca tcaaccctcg 7740 
caccttactt gttgcacccc tatttctact attcccactc ccttctaatt tctttttgca 7800 
ccccccttct ttactgactg cacctctatt ttgattactt tttgcacccc ccctgattgc 7860 
taacttcaga ctatctttct tgttttttgt ttttttggtt ttttggtcag atggcctcct 7920 
gtaaacaccg agctgtgccc acacccgggg aagcgtccaa ctgggactct tcacgtttca 7980
ctttcgagat tgcttggcac agataccagg atagcattca gctccggaac atccttccag 8040
agaggaatgt agagcttgga ccagggatgt ttgatgagtt cctgcaggaa ctccagaggc 8100 
tcagatggga ccaggttctg acccgacttc cagagaagtg gattgatgtt gctctggtga 8160 
aggagtttta ctccaaccta tatgatccag aggaccacag tccgaagttt tggagtgttc 8220 
gaggacaggt tgtgagattt gatgctgaga cgattaatga tttcctcgac accccggtca 8280
tcttggcaga gggagaggat tatccagcct actctcagta cctcagcact cctccagacc 8340
atgatgccat cctttccgct ctgtgtactc cagggggacg atttgttctg aatgttgata 8400 
gtgccccctg gaagctgctg cggaaggatc tgatgacgct cgcgcagaca tggagtgtgc 8460 
tctcttattt taaccttgca ctgacttttc acacttctga tattaatgtt gacagggccc 8520 
gactcaatta tggcttggtg atgaagatgg acctggacgt gggcagcctc atttctcttt 8580 
agatcagtca gatcgcccag tccatcactt ccaggcttgg gttcccagcg ttgatcacaa 8640 
cactgtgtga gattcagggg gttgtctctg ataccctgat ttttgagtca ctcagtcctg 8700 
tgatcaacct tgcctacatt aagaagaact gctggaaccc tgccgatcca tctatcacat 8760 
ttcaggggac ccgccgcacg cgcaccagag cttcggcgtc ggcatctgag gctcctcttc 8820
catcccagca tccttctcag cctttttccc agtgaccacg gcctccactt ctatccacct 8880 
cagcacctcc atacatgcat ggacagatgc tcaggtcctt gtaccagggt cagcagatca 8940 
tcattcagaa cctgtatcga ttgtccctac atttgcagat ggatctgcca ctcatgactc 9000 
cggaggccta tcgtcagcag gtcgcctagc taggagacca gccctccact gacagggggg 9060 
aagagccttc tggagccgct gctactgagg atcctgccgt tgatgaagac ctcatagctg 9120
acttggctgg cgctgattgg agcccatggg cagacttggg cagaggcagc tgatcttatg 9180 
ctttaatgtt ttcttttata ttatgtttgt gttctctttt atgttttatg ttatgttttt 9240 
atgtagtctg tttggtaatt aaaaagaggt agtagtaaaa atattagtat ttcagtatgt 9300 
gttttctgag taataagtgc atgataactc aagcaatcat aattctttag cttgttcaga 9360 
aaggttcaac acttgagatg ccactgatcc ttggagaaac actggttctg gaagcaaaag 9420 
tcaggtcaag aaatggaaca tgaatagcac agagtggaaa ggttagcttg atggaacaag 9480 
gtcataactg gtacgccgaa tacttgttta agtccctgtg agcatggttg tcaaactcta 9540 
gagtcaactc atagactctc atgagtttaa gagtttactt cagtcccgcg agttgactcg 9600
gaagcaaact cgcttttgag caaactcgtg gactcggagt gaactcatgt aaactcgtaa 9660 
gagtctacga gttgactcta gagtttgaca accatgcata agtgttcaaa attaaagcat 9720 
ttaaataatt aaaaaaagca caaatgtctt caaagaagca tgttcaatcc tctaatagga 9780
tcatcttcat gaatatcatc actttcatca tcatctccat ctccatcatc atcatcaagg 9840 
tcttcctcag attgtgcatc atcattaggt tccacaaaga ttaaattatc tagatcaaaa 9900 
gcttaaaata gatatcaaat atgctatatt agaaatagtt aaaacttaaa ataatacaca 9960 
agcaaatttt aaatatgaga aagttcagaa attatacctt ttcttggtgt tattaaagtt 10020
tcattttatc ttctcttttg cattttccat ctcctcacat atgaaaagca taattctatt 10080 
gaatttcagt aacaagtttg atccaactcc aacattgtaa ggtcagttgt tgtgttttgt 10140 
aatagactaa tatgaagtat gaagtatgaa ctatgaactt attgtcatct gtttgcaaat 10200 
tggtgcattt tgaatatatt tacttattat ccattttttt ttttttacga agtagactct 10260
cacgagtctg cgtagactct cgatatcgat aaccttgccg atgagagtgt gaacttaatt 10320
gtgagagaaa atgcctattt ttaagttcct ggttttgcat cattcttaga cggttagaat 10380
agttacttaa ggtggatatg atcaaggcca tgtttgtttg tttacctact tagccaaaaa 10440
gccaacctaa catagtttta ccccttgcac ccatgattga gccaactgat tattttgaat 10500 
taaccttgag ccaattaaac aaaatcctga ccttttagga ttttaagaga gtaaaaatgg 10560 
gttataaagg tcttaatttg ggggattttg ggaaataggt agccaagaca ataagtacag 10620 
cacacaaagt aggacacctt ttacaaacag taggcccaat ttcgaaaaaa aaatgaaaag 10680 
aatttaataa agggcagaaa caaaagagca agagaggtgt caaaagaaaa gtgttgtggg 10740
gaaataaaag ggctaagtaa aaaggcctag gcagaattgg aaatttttgt tctcttttaa 10800 
tcctaacttt gaatttccaa gaaaaaccat gattttttgt aagccaggcc ccgatacaag 10860 
ccaataaagt ccttagtgat ccaccaaagg taactagaga taactgtaac tgagatgaaa 10920 
tgcaaaattt tgaagtgtta cttgcaggtt gttatcaaat tgcaaacact aaactaggca 10980 
cttgtgagca gagggaaaca ccagccttgt gaggaaagta aggcaagcca aatttgattg 11040
agttccagat gactaactga ttcaattctt ctgttgtaat gctttcattt taagatgttg 11100 
acagatgcag aaaggaccag tgaaagaagg aggaactgag ccattgatag tgttggaata 11160 
tttaagaact tgcttgagaa tttacttgtt tttggttttc ttggggacaa gcaaagtttc 11220 
atttggggaa ttttgataac tgctaaataa ttgtgaatta atagtaaaga attattcaaa 11280 
ttttggcctg aaattaatta tttagcagtt atttgtgatt aaaagttaga aaattaatta 11340 
aattgaattt ttggttgcag ataagaaaat tggagttaca ttaagcaaaa aaggcaacaa 11400
aaaatgaagg aaaagaagaa gtctgaagca ggcccagccc aacacgcacg ctaagcgcgt 11460 
gtcacgcgct aagcgtgcaa ggcagtacag gcgctaagcg aggcgttaag ctcgaagatg 11520 
cagaatccgt tacgcgcgct aagcaagggc cacgcgctaa gcgtgcgatc caacagaaac 11580 
acacgctaag cctgcatctc gcgctaagcg cgcgatctga acgcgctaag cgcgaggtgt 11640
cgcgctaagc gcgcttacga aggcccaaaa cccactttag cagctataaa tagagagtca 11700 
gtccaaggga aacaacacat ctcgcctcag agcacttccc tcagcattct aagcctaagc 11760 
tctccctttt ctctttgttt ttattatcct cattctttct ttcaccccca gttgtaaagc 11820 
cctcaatggc catgagtggc taatctagta gctagggcct ggcaggccta aaaagccaac 11880 
gatatatggt gtacttcaag agttatcaat gcaaagaaga ttcattccag gtttttttgt 11940 
tctaattatt ttctttttat cttgcattca tttcttgaat ttcttttggg ttttatttgc 12000 
tcgggagagg gtatttccta ataagggttt aaggattaat gcatgcatca gttttagggg 12060 
ttatacgctt gggaaagggt aacacctaat agaacatctt aagaaaagaa tcatcgggtt 12120
agcattgcta ggcatagaat gataactcaa tgcccacgca tttagcaaca tctagaattt 12180
taccttaatg cattttaatt attgagtctt cgcaaaggca tttgggagat aggtagttaa 12240
aataggcttg tcatcgtgag gcatcagggg caagtaaaat taatagatgt gggtagaact 12300
gttacaaatg cattggtaat gaatatcata tttacatgca tcgtaggcca attgggtttg 12360 
tccggtcttg gcatttatat taattgtctt tctaaaacta tttgatctag taatagcaat 12420 
ctattcttgc acttactcct gtttttacta ttttactctt acaaattgaa aagtattcga 12480 
taaagtgcaa taaaatccct gtggaaacga tactcggact tccgaggttt actacttaga 12540 
gcgatttggt acacttgcca aagtctcaac a 12571 



<210> 21 

<211> 4609 

<212> DNA 

<213> Glycine max 

<400> 21 

gatctcccat atcctatggt acccaccaag aagaacaagg aacattactt ctgacgtttc 60
ttggaaatat tcaaaggact ggaaatcacc atgccattcg gggaagcctt acagcagatg 120
cccctctact ccaaatttat gaaggacatc ctcaccaaga aggggaagta tattgacaat 180 
gagaatattg tggtaggggg caactgtagt gcaataatac agaggaagct acccaagaag 240 
tttaaggacc ccggaagtgt taccatcccg tgcaccatag gaaaggaaga ggtaaacaag 300
gccctcattg atctaggagc aagtatcaat ctaatgccct tgtcaatgtg cagaagaatc 360 
aggaatttga agatagatcc caccaagatg acacttcaac tggcagaccg ctcgatcaca 420 
agaccataca gggtggtaga agatgtcctg gtcaaggtac accacttcac ttttccggtg 480 
gactttgtta tcatggatat cgaagaagac acagagattc cccttatctt aggcagaccc 540 
ttcatgctga ttgccaactg tgtggtggat atggggaatg ggaacttgga ggtgagtatt 600 
gacaatcaga agatcacctt tgaccttttc aaggcaataa agtacccata ggagggttgg 660 
aagtgcttta gaatggagga gattgataag gaagatgtca gtattctcga gacaccacag 720
tcttcgctgg ggaaagcaat ggtaaatgct ttagactgtc taaccagtga agaggaagaa 780 
gatctaaagg cttgcttgga agacttggat tgacaagaca gtattcctaa gggagaagcc 840 
agatttgaga ctctagaaaa ggaagttccg tccgagaaga agaagataga gttgaagata 900 
ttgcccgatc atctgaagta tgtgttcttg gaggaagata aacctgtagt gatcagtaac 960 
gtactcacaa cagaggagga aaacaggtta gtagatgtcc tcaagaaaca cagggaatca 1020 
attggatggc acacatcgga tctcaaggga attagccctg cttactgcat gcacaggata 1080 
atgatggaag aggactacaa gccagtctga caaccccaga ggcggctgaa tccaacaatg 1140
aaggaagagg taagaaaaga ggtactcaag ctcttggagg ttgggctcat ataccccatc 1200 
tctgacaacg cttgggtaag cccagtacag gtggttccca agaaaggtgg aatgacagtg 1260
gtacaaaatg agaggaatga cttgatacca acacgaacag tcactggctg gcgaatgtgt 1320
attgactatc acaagctgaa tgaagctaca cggaaggacc atttcccctt acctttcatg 1380 
gatcagatgc tggagagact tgcagggcag gcatactact gtttcttgga tggatactcg 1440
ggatacaacc agatcgcggt agaccccata gatcaggaga agacggtctt tacatgcccc 1500 
tttggcgtct ttgcttacag aaggatgtca ttcgggttat gtaatgtacc agccacattt 1560 
cagaggtgca tgctgaccat tttttcagac atggtggaga aaagcatcga ggtatttatg 1620 
gacgacttct cggtttttgg accctcattt gacagctgtt tgaggaacct agaaatggta 1680
cttcagaggt gcgtagagac taacttggta ctgaattggg aaaagtgtca ttttatggtt 1740 
cgagagggca tagtcctagg ccacaagatc tcagctagag ggattgaggt tgatcgggcg 1800 
aagatagacg tcatcgagaa gctgccacca ccactgaatg ttaaaggggt tagaagtttc 1860 
ttagggcatg caggtttcta taggaggttt atcaaggatt tctcgaagat tgccaggccc 1920 
ttaagcaatc tgctgaataa agacatgatt tttaagtttg atgaagaatg ttcagcagca 1980 
tttcagacac tgaaaaataa gctcaccact gcaccggtaa tgattgcacc cgactggaat 2040
aaagattttg aactaatgtg tgatgctagt gattatgcag taggagcagt tttgggacag 2100 
aggcacgaca aggtatttca caccatctat tatgctagca aggtcctgaa tgaagcacag 2160 
ttgaattatg caaccacaga aaaggagatg ctagccattg tctttgcctt ggagaagttt 2220 
aggtcatact agatagggtc gagggtcacc attttcacag atcatgctgc catcaagcac 2280 
ctgctcgcca aaacagactc aaagctgagg ttgattagat gggtcatgct attacaagag 2340
tttgacatca ttattaagga caagaaagga tccgagaatg tggtagctga tcatctatct 2400 
cgattaaaga atgaagaagt caccaaggaa gaaccagagg taaaaggtga atttcctgat 2460 
gagtttcttt tgcaggttac cgctagacct tggtttgcag acatggctaa ctacaaagcc 2520 
atgggaatca tcccagagga gtttaattgg agtcagagga agaaattttt gcacgatgca 2580 
cgcttatatg tgtgggatga tcctcatttg ttcaaggcgg gagcaaataa tttattaagg 2640
agatgcgtca caaaggagga agcacgaagc attctttggc actgccacag ttcaccctat 2700 
ggcatacatc acagcgagga tagaacaaca gcaaaagtgc tacaatcaag ttttttctag 2760 
ccctttattt ttaaagatgc tcacgagttt gtgcattgtt gtgataaatg tcagagaaca 2820 
agggggatat ctcgaagaaa tgagatgcct ttgcagaata tcatggaggt agagatcttt 2880 
gatagttggg gcatagactt catggggcct cttccttcat catacaggaa tgtctacatc 2940
ttggtagctg tggattacgt ctccaaatgg gtggaagcca tagccacgct gaaggacgat 3000
gccagggtag tgatcaaatt tctgaagaag aacatttttt cccatttcgg agtcccacga 3060
gccttgatta gtgatggggg aacgcacttc tgcaacaatc agttgaagaa agtcctggag 3120
cactataatg tccgacacaa ggtggccaca ccttatcaca ctcagacgaa tggccaagca 3180 
gaaatttcta acagggagct caagcgaatc ctggaaaaga cagttgcatc atcaagaaag 3240 
gattgggcct tgaagctcga tgatactctc tgggcctata ggacagcgtt caagactccc 3300 
atcggcttat caccatttca gctagtatat gggaaggcat gtcatttacc agtagagctg 3360 
gagcacaagg catattgggc tctcaagttg ctcaactttg acaacaacgc atgcggggaa 3420
aagaggaagc tacaactgct ggaattagaa gagatgagac tgaatgccta cgagtcatcc 3480 
aaaatttaca agcaaaagac aaaggcatat catgacaaga agctacaaag gaaagaattc 3540 
cagccagggc agcaggtatt actcgttaac tcaaggctaa ggctattccc aagtaagctg 3600 
aagtccaatt ggtcagggcc attcataatc aaagaagtca gacctcacag agcagtagaa 3660
ttggtggacc ctagagaaga gaactttgat aagaaatgga tcatcaatgg acagcgcttg 3720 
aagccttata acggaggaca actagagcga ttgacgacca tcatctactt aaatgaccct 3780 
tgagaaggcc tactgtcgag ctaaagacaa taaactaagc gctggttggg aggcaaccca 3840
acatattttg taaaaatgta gttatcttca ttctatgtaa aaaaaaagcc caacaggtgc 3900 
aaataggaaa cacgaggtgc aaaaagcaaa ggcccaacat gtgaagacaa caataggagg 3960
ggtgccaata gcaaaactga agtgggctac acgaagctac gtgcttagct cgcgtccgcg 4020
cgctaagcgc ccagattgca caaaaatagg tgagacttgg aatctggact attgctgtaa 4080
tatcttgcag gtaccattac gctaagccct acacagaggc ttagcgagaa caggcagcat 4140
ggaaaaaggg aaggaggagc gcgctaagcc acaacaagta atagaagaaa acgaagcacg 4200 
cgcttagcgg gcactgccgc gctaagcgca ctcttcaaca tcagtgaacg cgctaagcgc 4260 
gtgccagaag cgctaagcgc gtgtcaccgt caccagcagg aaggcgctaa gcgcgaggtt 4320 
gggccttagg gcccatcagc cttcgcgcct tactttttgc acaccccttc tttactaact 4380 
gcacccctat tttgatttct ttttgcaccc cctctgttta ctaactgcag tttgtttctg 4440
ctgtttcttg tttttgtttc agatggcctc ctgcaaacgc cgagccgtgc ccacacccag 4500 
ggaagcgtct aattgggact cttcccgttt cacttcagag attgcatggc acagatatca 4560 
ggacaacatt cagctctgga acatcctttc ggagaggaat gtcgagctc 4609 



<210> 22 

<211> 9139 

<212> DNA 

<213> Glycine max 

<400> 22 

acctggttgt ttgtatgctt gtcttaatgc ggataggttg tcaagtagct ttagtgctaa 60 
cactgagaag aatccgaagg aagaatgtaa agttttaatg acaaagagca gaatggaaat 120 
tcaagttgat gaagttagag ctgaagagaa ggtggaggga tataaacaac agtcgatagc 180 
tgagcctgca ctggaactag tttccgatct tattgaactt gaggaagttt tggaagagga 240 
agatgaccaa caggagagag agacaccaat aaaagatagt caagaaggaa taaagatgaa 300
ggaagagcat gaaaaagaaa aacaaaaaga aaaagaagaa atagaaaaag aaaataataa 360 
aaaaaatgaa aaataaaaaa agatggttga tgaggagaaa aaaaagagca agagtgaggt 420 
ttcaagagaa aaaaagagag agattacttc agctgaaggc aaggaagtac catatctatt 480
ggtaccttcc aagaaggata aagagcaaca cttagccaga tttcttgaca tcttcaagaa 540
actggaaatt actttgcctt ttggagaagc tctccaacag atgccactct atgccaaatt 600
tttaaaagac atgctgacaa agaagaacta gtatatccac agtgacacaa tagttgtgga 660
aggaaattgt agtgctgtca ttcaacacat ccttccccca aatcataagg atcccggaag 720
tgtcactata ttatgttcca ttagcgaggt tgttgtgggt aaagctctca tagacttggg 780
agctagtatc aatttaatgc ctctctcaat gtgtcgacga cttggagaga tagagataat 840
gcccacacgc atgacccttc agttggttga tcactccatc acaagaccat atggagtgat 900 
tgaggatatg ttgattcagg tcaagcaact tgtattccct gtagatttcg tggttatgga 960 
tatagaggag gatcctgaca ttcccataat cttgggacgt cctttcatgt ccgcgaccaa 1020 
ctatatagta gatataggga aaggcaagtt agaattgggt gtggaggatc agaaagtctc 1080 
attcgactta tttgaagcaa ataagcatcc aaatgataag aaagcttgct ttgatctaga 1140 
caaggtagaa caataaatag aattagctac tatagccatg gtactgaact ctcctttgga 1200 
aaaagcattg attaatcatg tagaatgtct tactaaagag gaggaacatg aagtgcaaac 1260 
ttgtattaaa gagttggatg gtgcaggaga aaattctgag ggacaggatg catttcaaga 1320
attgaagaat ggtgggcaaa tagaaaaacc aaaagtagaa ttgaagacct tgcctgcaca 1380
tttgaagtat gtatttctcg aagacaatga ctccaaacca gtgattatta gcagctcgtt 1440
gaagaaaata gaagatcaac tggtgaagat tttgaagaga cacaaagctg caattggatg 1500 
gcacatatct gacttgcaag gaattagtcc atcttattgc atgcacaaaa tcaatatgga 1560 
agctgattac aaaccagtga gagagcctca aagaagactg aacccaatca tgaaagaaga 1620 
gatgcataag gaggtgctta aattgtagga agcaggcctt atttacccct cctcggatag 1680 
tgcatgggtt agccttgtgc aggttgtccc caagaaagga ggtatgacag tcattaaaaa 1740 
tgataaagat gagttaatat ccataaggac tgtcaccggg tggagaatgt gcattgacta 1800
tcggaagctg aatgatgcca ctcggaagga ccattatcca cttcctttca tggaccaaat 1860 
gcttgaaaga cttgtagggt aatcctatta ttgttttctc gatgagtact ctggctataa 1920
ttagattgtt gttgatccta aagatcaaga gaagactgct ttcacctacc cttttggtgt 1980 
attcgcatat cggcacatgc cttttggtct gtgcaatgcc ccagctacat ttcagaggtg 2040
tattatggca attttttctg atatggtgga aaaatgcatc gaagttttca tggatgattt 2100 
ctctattttt gggccatcct ttaaggggtg cctattaaat cttgaaagag tattacagag 2160 
atgtgaagag tccaatctag ttctcaattg ggagaaattc catttcatgg ttcaagaagg 2220
aatagtgctg gggcataaaa tttcagtaag gggaatagag gtggacaagg caaagattga 2280 
tgtaattgag aaacttcctc ctccaatgaa tgccaaagaa gtgagaagtt tcttatgaca 2340
tgcaggattc tacagatgat tcataaaaga tttctcaaaa gtcgcccagc cacttagcaa 2400 
tctgttgaat aaagatgttg cttttgtgtt caatcaagag tgcatggaag catttaatga 2460 
tctgaaaacc agattagtgt ctgctccagt aagtatagca ccagattggg gacaagaatt 2520
tgagttgatg tgtgatgcaa gtgactatgt cgtaggtgta gtgcttcgac aacggaaggg 2580 
aaaacttttt catgctatat actacgccaa caaggttcta aatgatgcac aggtgaacta 2640 
tgctaccata gaaaaagaaa tgctggcaat tgtctatgca cttgaaaagt ttagatctta 2700 
tttggtaggt tcaagagtta tcatctacat cgatcacgca gctattaaat atttgctcaa 2760 
caaggctgat tccaaaccta gattgataag atggatcttg ttgttgcaag aatttgattt 2820
ggtgattcgg gataaaaagg gatcggaaaa tgttgtagct gaccatttgt ctagattggt 2880 
gaatgaggaa gtcacattga aagaagcaga agtgagagat gaattccctg atgaatcatt 2940
attcttagtg agtgagagac cttggtttgc cgatatggcc aacttcaaag ctacaagaat 3000
catcccaaag gacttaactt ggtagcagag gaagaaattc ctacatgatg ctcgattcta 3060
tatctgggtt gatcctcatt tgttcaagat aggagctgac aatctcctat gaagatgtgt 3120 
gacacaagaa gaggccaaga acatattatg aaattgccac aattctccat gtggcagcca 3180
ttatggtgga gataagacga tgaccaaggt tttgcaatct ggattctttt ggcccatgct 3240
tttcaaagat gctcatcagc atgtgcaaca ctgtgatcaa tgtaagagga tgaggggtat 3300
atcaagaaga aatgaaatgc ctctacagaa tattatggag gttgaggtat tcaattgcta 3360
ggggattgat tttgtaggtc ccttcccttc gtcttttggc aatgaatata tactagtggc 3420
gattgactat gtctctaaat tggttgaagc agtggctacc ccgcataatg atgctaagac 3480
tgtggtaaag tttctaaaga aaaacatttt ctcaagattt ggggtgccta gaattctgat 3540
taacgatgga ggcacacact tctgcaataa tcatctatag aaggtgttga agcaatataa 3600
tgtgacacaa agtagcatca ccttatcacc cccagaccaa tgggcaagca gaagtatcaa 3660
acagggaatt gaaaaagatt ttggagaaga ctatagcttc tactagaaaa gactagtcta 3720
tcaaattaga tgatgcttta tgggcataca gaacaacatt caagactccg ataggattat 3780
ctccatttca gatggtgtac ggcaaggctt gtcacttacc agtggagatg gaatataaag 3840
catactaggc cttgaagttt ttgaactttg atgaagccgc atccagagaa caaaggaggc 3900
tgcaactttt ggagttggga gatatgagat taactactta tgaatcttca aggctataca 3960 
aagaaagggt caaaaagtat catgacaaga agctgctcaa gaaggacttt cagccaggac 4020
gacaagagtt gcttttcaac tcaagactta aattgttccc tggaaagctt acatcgaaat 4080 
ggtctggacc atttaccatc aagaaagtcc gcccatatag agcagtggag ctttgtgatc 4140 
ctcaatctaa agatcctgac aggacatggg tagtgaacgg acaaaggttg aatcaatatc 4200 
atggttcatg caatcctacc cctcaagggt attggataga agactccaag aggattgggc 4260 
tagagctgct aaagaaggcc ttggggttct catgaacccc agggtaaatt tctgagccca 4320
tggaccaagg ttgggtcctc tcttctttgt aaatattaga ataggttttt ccttcttctc 4380 
aggctaagca ccaatatgct tctgtttttc agtcctttga ataaggctaa gcgcagctgc 4440
tgcactaagc ccttgttgtg tgtcaaggag gttgagctaa gcgtgcccta ctgcgctaag 4500 
ctcaactatc tcactatttt tgtgttttta tggtcaggct aagcgcgccc tatgtgctaa 4560 
gcctaagggt cattctggtg agcgtgagct aagcgcgcca tgctgcacta agcttagacc 4620
cttttttgtt ttgaaaattt tagacttagg ctaagcccaa catgctacgc taagcctatc 4680 
tacagaaaaa tattttgtgt ctttaggcta agctcgagtc tactgcgctt agctcatgag 4740 
taatatttta taaggcgcgc taagcccagc ctgctgcgct aagtgcccag ttcagttttc 4800 
agctttaatt ttttgttttt gatagaaata atcttattta accttgtggt ttgattttat 4860 
tctttcagat agcatcaaag aagagaaagg cacctgccac accttcccag gtctgatatg 4920 
gccgatcgag gttcacttct cttgtggcct aggaaaggta cactgatatt gtggtaccca 4980
ggaagatact ccctgagtgg aatgtggtaa tctaccacac tgagtttgat gagtttaagg 5040
aagaactaga gagaagaaaa tgggatgagg aattgaccag ttttgatgaa ggcaacattg 5100
atgttgccat tctgaaagag ttttatgata acctctatga ttccgacgat aaatcaccta 5160 
agcaggtgag ggtgagaggc catttggtga agtttgatgc agacactctg aacactttct 5220 
tgaagacccc tgtgataatt gaagaggggg aaaagctgcc tgcctactct agatttgcac 5280 
tcttgagtcc tgatcctcaa gagttggctg ctaagctctg catcccaggg agggaatttg 5340
agcttaatgt tgacgacttg ccactaaaga tcctcaggaa gaaaatgacc acactcgctc 5400 
agactaggag tgttctttct tactccaact tggtccctac ctcccacact tctcacatca 5460 
cactggatcg ggccaagttg atttatggca ttatcatgaa gatggacatg aatttgggct 5520 
acctcatctc ccaccagatt tctatcattg cccagcatga ctcctctagg cttggattta 5580 
caaccttaat catagctttg tgtaaagcta aaggagtcac attagattcc aaatctttgg 5640 
agagtcttag ccctgccatt aacatggcat atataaagaa gaactgttgg aatctagatg 5700 
atccaacagt gacattcaga gagccaagga aggccagggg taaaagaatc gaggctcccc 5760 
ctacttcagc agcaccaggt gcttctgctc cttcttcatc ttctttacca gatccttcag 5820 
caccatccac ttcgactcca catcttccat ggttactagc ttcagctccc actcccttac 5880 
cagcttcaat tcagctcctt ctacaggacc ctcctcattc acctctaaga cattatttgc 5940
tatgctgcaa agcctgcaca aaggccagat catcatcata cagaggttgt agagctctgg 6000 
ccagaaacca accatgagta tagaggagtt ccttgcacaa gtggcttgcc caggagtcga 6060 
gccttctcct tctggagggg gtgaggcctt tgcagcccaa gagccttgcc agcagagaag 6120
cctgtgccag aagcagagga tgagcttgtt cttcctgagc catttgttta tgagattgat 6180
ccagtcgctc aggaggaagc agcagctcag gagcttcctg cacctatttc tgaggatacc 6240
ctgccatctg caccagcatt ggagtaagag cagcctagtt cacaggatcc accagctgct 6300 
ccaatgctgg atctgaacga gcatgcagaa gatcagcagt aggatgatca tgagttttaa 6360
attctacata gtttttaaaa ttttgcaaat tatgaatagt ttcttttatc aattatttag 6420 
ttcatgtcaa ttatttgttt atgctttatt agtctttaaa ttttagtctt ttaaattttt 6480
gttgtttgag tgttgatagc ttgtacaaaa gcatgtttga acagtgaact tattgattat 6540
gatattcagt ggtgtgattt cttatgaatg aagtgtttgt gaatgacttg aatgagaaaa 6600 
tgtatgaatt gagtggactg gaatgattag atgtttgttt tgatcaagct tgtagtcatt 6660 
agaagaaaaa gaacatgtga ttagaagtat gactgaaaat gttagtcagt ttgtcaaatt 6720
gattgtgaag gaatgcattg accgtatccc agtgagagtg tgatccttaa attttgagag 6780 
aaatgacttt aatttagcac taatttttgc acgaatcttt gaagtatgga ttgaatgcat 6840 
gaattgagga taatgaaggc catgttttga ttgtgatagc tatttagcca aaaagctgac 6900
cttgtgcttg aatgatttat cccttgcacc cagtttgagc tgaatgaatt attgattgat 6960 
tgaaccttga gcctatatag tgttttctcc tgcttccttg tcttaggtta taggagagca 7020 
taatccacag aaaagcttgg ttcaaggcaa atttgttcca aatttggggg agacactggg 7080
taaagaaata aaatggtcaa aacagagcaa catatacaca ttgttttctg tatgtaaaaa 7140 
aaactgtaag tataaataaa aatgtataaa agtgtgtgtg ctgcaaatca aatcaatgaa 7200 
agctaagtgc ttaataaaag gcaagtatgg ggtaggaatg aataaaaaaa aaagtaaagg 7260 
tttatctatg gatgaatgct ctcgtagaat ctaagctttt gaatcctaga aaaaccatga 7320
tttgttggca gcctaacctc attacaagcc tagaaagtcc tttggattca ttttgtgtgt 7380 
ttatttctgt atggtatgag atgaaatgca aaagttagga cttgtgttag ttgttcatga 7440 
tggaatgagc ctaaacactt aagcttgagt gaaacaatga ctgtgaggct ttggttgatg 7500 
attttttcct tgatatctgt cattctcact agcttatttt agttgtgact ctaatgcata 7560 
tgttcctatc tttgaaaaac tgcatgtttg tgaaaagaaa ttggttgaag cattccatga 7620
tattcatttc atatgattga atttctctgt gaggagaaca ccatttggat tgaccactgt 7680 
attttgtcac ttgaggacaa gtgaactgtt ctttctttgc ttgaggacaa gcaaaacttt 7740 
aaatttgggg gagtatgtta gtcatcttat acgactaact tttgtataga aaaaattttc 7800 
caaaacttgt atagtttctc caatttatag ttattttgta gggatttgta aataaatctt 7860 
gttttattgt tatagttgtc tctagaatat tttccatttg atttaatgat gaaatctgtt 7920
caatttcagg ttaaaagagg ctaagtcttg aagtgctaaa agtgggattt acgctcagct 7980
caccatttgg cctcaacgcg catccaccgc taagcacagc ttcagcgcac ttagtgtgac 8040 
agaagaatct ggcagagcat aaatatcaag gccgcttgct aagcaagatg gttgtcttta 8100 
gccagactca gcgcatgact ggcgctaagc tcaaatccac taactcgcgc taagcacagg 8160 
ggtggcacta agtgcaacgt cgcggattta aagcctattt aaagcctgtc ttgtgcagaa 8220
ttaggtaata tacacacata gaattttagc aagcaataca aaattccaaa gcaaggacac 8280 
cacagtgcta atttcgatat agaagctctg gaggcagcaa gaggagaagc tttgcagaga 8340 
agcctaggat tcttcaatta gagagagatt agtgagctgt agagtgattg tgaggtgttg 8400
agaagaggag gagggatccc ccttcttgtg taaggaacaa ttatttggta ctctcaaact 8460 
catttgtgtt agggtttttc tgtaatggct agctaaacac ccttgttggg gatttctaag 8520
gaacaactga tgtaattact ttaatatcta attaattatg ttttatgtgt tcaatgcttc 8580 
tttcaatgct taattactgc atgctcttgg tctgatcacc catttgtgtg tattgttagg 8640 
tgactttagc attgggaaat gtaccgttgc cttagaactt gatagaagca ggactaaata 8700
actacattac cagggatgga ttatggggtt ttggttttct aaatatgttg tgatgataat 8760 
gctatttaag ttaagcctag tcatacaaga gggatctgcg gacgaagctt aggttaaatt 8820
agtataaact tacaagggat cgagatttag tactttaggc tacaacatag aacacaagaa 8880 
catgattaat tagagaaata tcctcatatg catcaacttg tttgttagaa agacccaacg 8940 
ctttttacct attgttgtca acttttactt acttgcattt tttttttacc atagaagtag 9000 
tttatttctg ttttaaccat caattatcaa tgttgttcca acaatgcctt acttctgaat 9060 
aaaactctgt ctaataagca agttccctaa attcgatact tggatcactc tgttttaatt 9120
ttaaatactt gacaactca 9139



<210> 23

<211> 10482

<212> DNA 

<213> Glycine max 

<400> 23 

tgttagtcgt cttatatgac taacttttgt atagaaaaac ctttttcaaa acatgtatag 60 
tttccccaat ttataattct tttgtaggaa tttgtaaata aatcttgata tgttttgata 120 
cctgccatta gagtatcttt agttggagtt aatgagaaaa tttgtacaat ttcaggtcaa 180 
aagaggctaa aatcttgaag tgctaaaagg agcagtcgtg ctaaatagag cctgtgggct 240
cagtgcacat ccaccgctaa gtgcagcttc agcatgctta gcgtgacaag ggaacctgaa 300 
agagcacaag aatcaaggtc gcgcgctaag cgagacgttt gtcttttgcc aggctcagcg 360 
cacgactggc gccaagccca aatccactta ctcgcgctaa gcgcgatgtc gcgatttcag 420
agcctattta agcctgaatt gtcagaatta gggtatgatt ttaagagacc agagctgtat 480 
atttttgcac aaacttcgag aatagtgctc tggaggcagc agagaggcag cagctaagca 540
gggaagctag ggttcatcac tttgagagat tagagagtgt tttagtgatt gtgaggtgcc 600 
aagaagacga ggagggatcc cccttcctgt gtaagcaaca attgctctgt actttctgtc 660 
tcatttgtat tagggttcct tgtatggctt ggtaaaaacc ctagttgggg atttctaatg 720
aacagttgat gtaattactt ttcatatcta attaattgtg ttttgtgtgt tcagtgcttc 780
tttcaatact taattactgc atgctcttgg cctgatcacc ctcttgtgtg tactattagg 840 
tgactttagc attgggaaat gtagtgctgc catagaacat gatagaagca aggctaaata 900 
actgcattac ctaggatgga ttgtggggtt ttagttttct tattatgctg tgatgataat 960 
gttgtttaag ttaagcctag tccaacaaga gggatctgag gatgaagctt gggttaaatt 1020
agtctaaact tatgagggat cgaggtttag tactttaggc ttcagcatag aacacaagaa 1080 
catgattaat tagagaaata tcttcatatg cattaactcg tttgttagaa agacccaaca 1140 
ctttatacct attgctgtca actttttaat tacttgcatt tactgctttt taacatagca 1200 
tctagtttac ttttgtttat attctcaatt atcaatgttt gttcacacaa tgccatattt 1260 
ctaaataaaa ctttgtctaa taaacaagtt ccctgagttt gatactcgga ttattccgtt 1320 
ttaattttaa atgcttgata acctggtgcg ttttccgata tttcatttcc cttgaatata 1380 
ctgcttgtaa atttgataga aaggaactgt gttgaagggt aaacaaaaat ttgacacaaa 1440
gcatttatgg cgccgttgtc ggggaactgg attcattaga agagttcagt tcagttttaa 1500 
ggcattgctt tattttgttt tctttaattc attgattctt tttgctaaca ttttagttac 1560 
tgcacatttt attgttcttt ggaattggat aatttttgtt ttgtttcttt tgtatgcaaa 1620 
ggagatctgt tgtaggtgat ttaattccca tagatttgga gattaatgct acttgcagga 1680
gacaaaatgc agagagaatt agaaattttt tgcaggactt agaagtagca gcaactctag 1740 
gagagtgacc ctagaagatt actcaagtta aggccacagt ccaagcagct attagatgct 1800 
tctgctgggg gaaaaataaa gttaaagacc cccgaagaag ccatggaact cattgaaaat 1860 
atgactgcaa gtgacattac tattttgaga gatagagccc acattccaac aaaaagaagc 1920 
ctactagagc tttcatcaca agatgcattg ttggcacaaa acaagttgat gtccaagcaa 1980 
ttggaagcat tgaccaaaac actaagtaag tttccagctc aattacattc tgcacaatct 2040
ttaccatcta ctattttgca ggtcacagtg tgtgccatct gtggtggagc tcacgattct 2100 
ggttgttgta tccccaatga agaaccaaca actcatgaag tcaattacat gggtaaccaa 2160 
cctagaaata attttaatgc aggtggattt cccgaattcc agcatggaca gtaatacaac 2220 
caacaacagg gacaatggag gaccaccctg ggaattaatt caatagagac cagggtggac 2280 
cgtccacaag gccgtaacaa caagggccta gtctctatga gcgtacaacg aagttggaag 2340
agactctagc tcaatttatg caggtttcta tgtctaacca aaagagcacg gagtttgcca 2400
taaagaattt ggaagtccaa gtgggacagc ttgcaaaaca gttggtggat aggccgtcaa 2460 
agagctttag tgctaacact gagaaaaatt cgaaggggga atgtaaagct gtcatgacaa 2520 
gaagcagaat ggcaacccat gttgatgaag gaaaagctta gaagaaggtg gaggagcata 2580
aacaacagtt ggcagctgag ccggcacttg aacccatttc tgattttgtt gaacttgagg 2640
aagttatgga agatgaagat gaccaaaagg aaaagagaaa gaagaagtag aaaaagaaaa 2700
atattagaaa aatgaaaaag aaaatgagaa ggttgaggaa agaaagagga gcaagagtga 2760
ggtttcaaga gagaaaaaga gagagattac ttcagctgaa ggcaaggatg taccatatcc 2820
attggtacct tccaagaagg ataaagagcg acacttagcc agatttcttg acatcttcaa 2880
gaagtcggag atcacattgc cttttggaga aactctccaa cagatgccac tctatgccaa 2940
atttttaaaa gacatgctga caaagaaaaa ctggtatatc cacagtgaca cgatagctgt 3000
ggaaggaaat tgtagtgctg tcactcaacg catccttcca ccaaagcata aggatccagg 3060
aagtgtcaca ataccatgtt ctattggtga agttgcagta ggcaaggctc tcattgactt 3120
gggagccagt atcaatttaa tgactctctc catgtgccag caacttggag agttagagat 3180
aatgcccact cgcatgaccc tacagttggc agatcgctcc attgctagac catatggagt 3240
gatcgaggat gtgttgattc aggtcaagca gcttgtattc cctgcaattt tgtggttatg 3300
gatatagagg aggatcctaa cattcccata atcttgggac gtcctttcat gtccacgacc 3360
agctgtgtag tagatatggg gaaaggcaaa ttagaactgg ttgtggagga tcagaaagtc 3420
tcattcgact tatttgaagc aatgaagcat ccaaatgatc aaaaagcttg ctttgatctg 3480
gataaggtag aataggagat agaattagct gctatagcca tggtactgca ctctcatttg 3540
gaaaaagcac gattaatcat gtagaatgtt tgaccaagga ggaggaacat gaagtgtaga 3600
cttgtattaa agagttggat ggtgcaggag aaaattccga gggacatact gcatttgaag 3660
aattgaagaa cagtgggaaa atagaaaaac caaaagtaga attgaagact ttgcctgcac 3720
attcgaagta tgtatcttgg aagacaatga ctccaaacca gtgattatta gcagctcttt 3780
gaagaaaaca gaagaagatc agttggtgca gattttgaag aaacataaag ctacaattgg 3840
atggcacata tctgacttga aaggaattag tccatcttat tgcatgcaca aaattattat 3900
ggaagctgat tacaaaccaa tgagacagcc tcaaagaaga ctgaacccaa tcatgaaaga 3960
ggaggtgcgc aaggaggtgc ttaagttgct agaagcaggc ctcaccccat ctcagatagt 4020
gcgtgggtta gcccggtgca ggttgttctc aagaagggag gtatgacagt cattaaaaat 4080
gataaagatg aattaatatc cacaaggact gtcaccgggt ggagaatgtg cattgattat 4140
cggaagttga ataatgccac ttggaaagac cattatccac tccctttcat ggaccatatg 4200
cttgagagac tcgcaaggca atcatattat tgttttctgg atggatattc tagttacaat 4260
tagattgcta tagatatcaa agatcaagat gtcgcaacct acccttcagt gggagggcga 4320
cgcgtgactt gcgcgtgcat gttccaagaa aggaatacgc gcggagtcgc caccaacgtt 4380
tatttgagga aaacgtcgga aaaaccggaa aagacgtgat ctacgaactt taagtgaaag 4440
gttcgggagt tgtatttacg cacggggaag gtattagcac cccacacgtc cgtcacaaga 4500
gatgacaacc tctaatcaaa tgtgcaaata tgacttcaat ttatgttatc ttcccccttt 4560
tttcacgttc ttatgttttt tttatgcctt tttatgtttt tatctttttg tggttgacaa 4620
gggcgtttcc ctttgctcct acgtattcct caattgtgat gagaaaatca aacctacgta 4680
gttcttttgt gaacaaagcg ttttggttaa gttatttttt atcctttttt gcaagatatg 4740
ttttattgaa tgaaaggtca tttaaggtgt tggaccatta gacaatcttt cgattctttt 4800
gaaaagtgag aaaacattaa ggcattggac cattaatgat ttctttattt ttgaaagagt 4860
taacaaagtt acatattgat tttaggcttt ttagaaatct acacttaacc aataaaagcg 4920
gaaaagacca tttcaaggcg ttggaccttt gaaaaatggc gtttttaggc gatgacaaaa 4980
gtttggttta tgaattgatt ttagccttag tttcactttg gttattagtc gattcgattt 5040
aagaaagaga aatcccaaag aaaaacgtcc gattgatttt ttgatttatt ttactaaaag 5100
atatttttga ttattatatt attattttac ctatttttgg ttttcaacgg gttacggcat 5160
gaccgaacag tcggatttca ttttaacaga aattaacgga tgttacaatt taaatgatcg 5220
gtggaaattt attttatttt ttgattaggc gagaaaatga cttaagtaaa tgactaaagc 5280
acgtcaaaag ggggtacgga aagtaaatga aatgaaaata aaagcatgtg aaacaaatga 5340
ggaccactaa gggtacatag aatgaattgt ttgatttcgg gaacttaccg gttgaagatc 5400
gaagaacgac gaagaacgaa cgaagaacgt cgatgaacgg ttgaaaatct tcgcaaaatc 5460
acccacggaa acgttacgga agcacctcgg cttggatttt cttcacggaa acaatttttc 5520
tcactaattt taagtgaatc tcagatacca ggagggtcga acatttttgt tcttccctcc 5580 
ttcccttatt tataggaaaa ggaaggagat gcttgccacc cagctcgccc aggcgagcta 5640
ggttgcttcc tccagaagca aatcctggaa ggcccaagtg ggcctggttg ctatttgaac 5700 
ccccaatttt actaaatata ccccctgcct ttttttggtg attctttttc cgtaaagtta 5760 
tggaaactta cgaatttcgt aacgatactt gttttctttc cgtaatgttg tggaacctta 5820
cggattacgt aatcatccct tttttgcctt ccggaacgtt acagaacttt acggattgca 5880 
cactaacact tccttttaat tttcggcatg tcacgaactt cacggattgt gctaccacgc 5940
ttttcttttg gcttccgaca tgtctcggaa cttcacaaat tgcctaacca tgggtgccaa 6000 
atacctcgaa gtggtcaaac gacggtcgca tcccaacaac ggatggttct cggacgaaat 6060 
tagggtatga cacaagagaa gacaactttc actttccctt tcggtgtatt tgcatatcga 6120
tgcatgcctt tcggtctatg caatgcccta gctacatttc agaggtgtat gatggcaatt 6180 
ttttctgata tggtggaaaa atgcattgaa gttttcatgg acgatttctc tgtttttgga 6240 
ccatctttga tggttgctta tcaaatctgg aaagagtatt ttagagatgt gaagagtcca 6300
acctggtact taattgggaa aatgtcattt catggttcaa gaaggaatag tgctggggca 6360 
taaaatatca gtaaggggaa ttgaggtgga taaggtgaag attgatgtca ttgagaaact 6420
tcctcctcca atgaatgtca aacgaatgag aagtttctta ggacatgatg gattctatag 6480 
gtgacttata aaagattttt caaaagtcgc caaaccactt agcaatttgt tgaacaaaga 6540
tgttgctttt gtgttcaatg gaaagtgtat tgaagcattt aatgatttga aaaccagact 6600 
agtgtctgct ccagtaatta ctacaccaga ttgggggtaa gaatttgagt tgatgtgtga 6660 
cgcgagcgat tatgctatag gtgcagtgct tggacaaagg aagggcaaaa tttttcatgc 6720
tatctactac gccagcaaag ttttaaatga tgcacaggtt aactatgcta ccacagaaaa 6780 
agaaatgttg gcaattgttt atgcacttga aaagttcaaa tcttatttgg taggctcaaa 6840
agtcatcatc tacattgatc atgcaactat taaatatttt ctcaacaagg ccaattccaa 6900 
aaccctgctt aataagatgg attttgctgc tgcaagaatt tgatttggta attcgggata 6960 
aaaagggatc ggaaaatgtt gtagctaacc aatttgtcta gattggggaa taaagaagtc 7020 
atgtcgaaag aagctgaaat tagagatgaa ttccctaatg agtcattatt cttggtgaat 7080 
gagagacctt gatttgctga tatggccaac ttcaaagccg caggaatcat tccaaaagac 7140 
ctaacttggc agtagaggaa gcaattcctg catgatgctc gattttatat ctgggatgac 7200 
ccgcacttgt tcaagattgg agttgacaat cttctccgaa gatgtgtgac acaagaagaa 7260 
gccaagaaca tattatggca ctgtcacaat tctccatgtg gcggccatta tggtggagat 7320
aagacgacga ccaaggtttt gcaatctgga ttcttttggc ccacactttt caaggatgct 7380 
catcagaata tgctgcattg tgatcaatgt caaaggatgg ggggcatatc aaaaagaaat 7440 
gaaatgcctt tacagaatat tatggaggtt gaggtatttg actgttgggg gattgatttt 7500 
gtaggtccct tccctttgtc ttttggcaat gaatacatac tagtggttgt tgactatgtc 7560 
tctaaatggg ttgaagcagt ggctaccctg cataatgatg ctaagattgt ggtaaagttt 7620 
ctaaagacga acattttctc cagatttggg gtgcccagag ttttgattag tgatggaagc 7680 
acacatttct gcaataataa gatacagaag gtgttgaagc aatataatgt aacacacaag 7740 
gtagcatcag cttatcaccc ccaaaccaat gggcaagcag aagtgtcgaa caaggaattg 7800 
aaaaagattt tagagaagac tatggcttct actagaaagg actggtccat taaactagat 7860 
gatgctttat gggcgtatag gactgcattc aagactccga taggtttatc tccatttcag 7920 
atggtgtatg gcaagtcttg tcacttacca gtggagatga aatataaaac atattgggcc 7980 
ttgaagttgt tgaactttga tgaagccgaa tccagagaac aaaggaggct acaacttttg 8040
gagttggaag agataaaatt aactgcttat gaatcttcac agttgtacaa agaaaaaatt 8100 
aaaaagtatc atgataaaaa actgctcaag agggattttc aacaaggaca acaagtgttg 8160 
cttttcacct caagacttaa attgtttcct gggaagctta aatcgaaatg gtctagacca 8220
tttaccatca agaaagtccg aacatatgga gcagtggagc tttgtgatcc tcatatgggt 8280 
ggtgaacgga caaaggctaa agcaatatca tggtggagct attgagagat tgaacactat 8340
tctacacttc aatccaggat aacaggacga tgcgtcaagc taatgacgtt aaccgagcgc 8400
ttacggggag gcaacccagg tctcttttta tttctatttt tcttgcattt aatttagtta 8460 
gtttaattgc ttgtgattgt aaatgatttc taagcttggt tagtattgag aaaagggttt 8520
caaagtttta gtaaagagat ggatagaaaa gacttagaga aaaaattttc agttgtccat 8580 
ccgctaagcg cagcccttgt gctaagtgcc atgtcttaat gcactaagca tgtgcttgct 8640 
tgcgctaagc actttgacct ttcaccagtt ggctagatgg ttcagctaag cgcacatcac 8700 
tgcgctaaac ctaagttctt ctctggattt gaacttcatg acttgggctt agaggagttg 8760 
atgcgctaag cgcaactcct tctctgttga aaaattattg taatagcatt aagcttaatt 8820 
tcctctctgg aattgaactt tcaggaattg ggcttagcag caggatacgc taagcgccaa 8880 
tccttcacta ttttgaaata cttggaattg cgctaagcct ggaaccatca ctgtaagtag 8940
agcttgtttt agtgctaagc ctaacatctt aggctaagtg aaaattgcag gaccaatcag 9000 
agttgcagac agtgctaagc gcgtgtcctc gcactaagct tgaatacctc tctggaattt 9060
gaaattattg aattaggctt aacgcgagag gtggcgctaa gcgcatgggc cttaaactca 9120
aatgtcatgt tggcatgcta agcgcaacta tgcgctaagt gcgccaaaca aaaatgctaa 9180
aataaaatag aactaccaat ggcagttacc atttacactt caaagctttt actcccttat 9240
gcttgtgccc acattcgtgc ttttgtgcat tttgctgcct ttgcttcaag ttattcctgc 9300 
tttcttgctc tcatcttgca tttccatcac aatccaagta agttttcatg tttattttca 9360 
ttttctttta taagcttaaa ccttagggta gatgatttag tgctttttag tttgcaattt 9420 
tttttaggtt tagtgttttt aggttagttg ttagttaagg taggtttagg gtttacaatg 9480 
taggttttag gttaggtttt tgagcccctt aggggcaatg cctgaaaaag gggtgaaaac 9540
ccgtgagtaa tttctagaaa tagcgatgaa cgtgctaagc gcacctgctg tgcttagcca 9600 
gttcatcgca acttccttct aatgagtttc aatgatgagc tcgataagcg cgtttgtgcg 9660 
ctaagtgaga caagtgtttt agacacttag tatttttttc aatttttgtt cagcactaaa 9720
gcctggcttc tcaggctaaa gcacaattct gtctttattt ttcaattgtt ggaataaggc 9780 
taagtgcagc ttgttgtgct aagcccatgt tatgtcttag tgaggttgag ctaagcgtgc 9840
cctactgcgc taagctcaat tcctccactg ttttcaaaag tgtggattta ggataagccc 9900 
agcttgttgc gctaagccta gtctatggaa aaacattttc tgagtactca cgctaagcgt 9960 
gtggctatcg ggcttagccc atgagtaaat tttcataaag cgcgctaagc ccagccttct 10020 
gtgctaagca cccagtccta ctttcagttt tatttttttg tttttgttga ataatcctgt 10080 
tttaactctg ttgtttgatc taattctttt cagatggcat ctaggaagag aaaggcccat 10140 
gcctcaacat cccaggcccg ctatgataga tccagattca catctcagga ggcctgggat 10200 
cgttattcta gtgttgtcat tggcaggaaa atattacctg aaagaaatgt catgctctat 10260 
tacacagagt ttgatgaatt cactgaagag ttagagagaa gaaacaggca caaggagtta 10320
acaaatttta tggatggcaa cattgatgtt gccattatga aggagttcta tgctaacctc 10380 
tatgacccag aggataaatc acctaagcag gtgaggttca gaggtcattt agtgaaattt 10440 
gatgcagatg ctctgaacac tttttttatg acccctgtga tc 10482 



<210> 24 
<211> 1857 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 24 

atgagcaatt acagtggcag ttcttctgtt gatcctgact acaacatgga tgagacagaa 60
tcgtcatctt caaggccaga gagagaacag agagaatacg aaagtttcag aaggaaagct 120
gagatagccc gaggaaagag agcgatgaga gagaggtatg agcttataga cgaagatctg 180
gaggacgagt acatgcctga acagactcgc agagctacca aacttctgca caagcccgac 240
atattgcctg ctgaggaata tgttaggctt ttcaagctga atgagttctg tagcacgagg 300
tatccttgct cgacctcact tgcacaactc ggattgttgg aagatgttca gcacctgtac 360
caaagttgtc atctggacac tttgatggct tatccgtatg tagcatatga agatgagaca 420
atacaattcc tctccacact acaagtagag ctctaccaag gtatgacctc tgatgagttg 480
gattgtgaag gattgggatt cttgcgattt tctgtgtatg gtcatgagta caggttatca 540
atcaagcgat tggaaggatt gtttgatttt cccagtggaa cgggatctaa gccaaagtat 600
gaaagagaag agttgaaaga cttgtggatc accatcggca gctctgtacc gttgaatgct 660
tccaggtcaa agagcaatca gatacgcagc cctgtcatca ggtacttcca gcgttctgta 720
gccaacgtac tctactcccg agagattaca gggactgtca ctaactctga tatggagatg 780
atcgcaatgg ccctcaaagg aactctccgc caaactaaaa atggcatgtc cctccagggt 840
gaagtcaatg acacacctct ctctatactt cttctgatcc atctgtgtgg atacaaaaac 900
tgggcggtca gcaataaccg caagagagca cgaggcgctc tgtgcatagg tggcgtggtg 960
acacctattc tgatagcttg tggagtccca ctcatttctg ctggactcga gccacgagca 1020
atggatatcg agcacctacg tcactgccaa ttcctggagt ttgcaatggt tgacgatttc 1080
cacaggttca ggtttgagca ctctacagac aggagagcta acatccttct ccctagccct 1140
gaggtcacac ggataatcga gggagataac attgatttta ggcctgagat tggacgcctc 1200
tactatgaga acgctccacc attagatgag gacgatcttc ttgaagaagc tgcttcggat 1260
gggatggatg aagatggagc agtaaagttc gacactagca tgtatcactt tgctgaacat 1320
gtacctccag cgaggcagag caagagcttg actgaagctc ataagaatta cagtaaattg 1380
cagaagtggt gcaagaagca ggacaggctg atcgccaagt gtttcaagct tctgacagac 1440
aagctgagtt gctcttcctc caccactgct attccacagg tacaacctcc tatggaaatg 1500
ccatcgagga gaattaatgc acctgcgcac aggcctgagc ttagcgagca gagagtccca 1560
catgtccagg ctaggcattc gtcattcgaa tcccgggaac acaagagaag aaggaaggct 1620
acactcactc gatctagcag cagatcacgc ctcattcact cgaggagatc actcgaccgt 1680
ggtgctggcc gcagcagaag gagagatgtc gagtttcctc agagcggtgc tggccgccac 1740
agagctgatg aggtcgagta cccatctgct ggagctgata cagaacaagg aggttcgtct 1800
atggcctggg agcaatcgca ggcagccatt gacgagcaac tacgttcatt cttcgac 1857



<210> 25
<211> 1254
<212> DNA

<213> Pisum sativum
<400> 25

atggaatcca ggtccggagc ttcgaaaaag agaaagggcg ggaatagttc ccgtcccgtg 60
cccatacaat tcgacaccga caaatttgtc gggccaaagc aagcagtaag atatgttgct 120
ttggaaaagc gaaagatttt gccggaaaag agatttataa tcaaccctga aggcacgaac 180
cgtacattcg ccgggctgat taacagcaaa aagtgggacc ggttaatatc ccccttgaag 240
cattacgaca tcgcaacagt gcgtgagttc tacgcgaacg cactgccgaa cgacgacgag 300
ccattcacat ggacgtctag agtgtccggc cgtcctgttg cgttcgatcg ggatgcaatt 360
aaccgtgtcc tgggtgaacc gctccatctg ggagccaatg agagagacac ttaccaccaa 420
gatttaaggc ttcaccggga taccgattcg atttctactg ccctgctttt ggaagggaaa 480
tcagttgagc tgaacccatc tggggttccg atgagatacc atagggagga catgattccc 540
ttggctcaac tgatcctttt gttggttctt acaaacatca aacccaagtc tcacacttct 600
accgtgccga tcccagtggc acacttggta cacatcatcc tcacgaatat ccagattgat 660
gtggcaagga ttattgcttt ggagttgaag tccgtgattg aaagcgggct aaagtcgggg 720
gaacgagtga attgtcccct tgctttccct tgtctaatca tggctttgtg ccaacaagcg 780
agggtgaggc taccctccaa gggtcaagta aggatcccgc cggccattga tgaccgatac 840
gtggccaagt actgcaaacc gaagaatgta agaagtagtt cagctgctga ggttaccggg 900
gcttctgatg gtcctggtac ttttactcta ggatccgatc ctttccagca ggctgtctgc 960
aactacaact gggattggat ggcggcaact cagcgcgtca tgctcgatat gcacgattct 1020
atgcagctgt tacagttgca gatgcgcgac ccctccggtg agcattctat gatgtcacgt 1080
gagcagtttc tgcagcacgc tagctggcct gtggacaggc ctgtgtttgg agagggggcg 1140
ggtgctggtg caactggtgc tggtgctttt tctggtgctg ctgatgatga tgatgatgat 1200
gaggctaccg gttctgaagc cggtagtgat gagggttatg agtccttgga gggc 1254



<210> 26 
<211> 564 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 26 

tgtgattcat gccagagaaa aggcaacatc aatagaagaa atgagatgcc tcagaatcca 60
atcttggaag ttgagatctt tgatgtatgg gggattgatt ttatgggtcc attcccatct 120
tcatacggta ataaatatat actggtcgcc gtagactacg tatcaaagtg ggtcgaagct 180
attgctagtc ctaccaacga tgcaaaagtt gtgctgaagt tgttcaaaac cataatcttc 240
ccaagatttg gagttcccag ggtagtaatc agtgatggcg gaaagcattt catcaacaag 300
gtttttgaga acctcttgaa gaagcatggg gtaaagcagg ttgagatctc caatagggag 360
ataaaaacaa ttctggaaaa gactgttggg attacaagga aagactggtc tgcaaagcta 420
gatgatgcat tatgggctta caggacagct ttcaagaccc ccataggtac aactcctttc 480
aatcttctct atggaaaatt atgtcatcta cccgttgagc tcgagtacaa agcaatgtgg 540
gcggtaaaac ttctgaactt tgac 564



<210> 27 
<211> 180 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 27 

atcgaggaga tggtggaggt tttcatggac gatttttcgg tctatggccc ctctttctcc 60
tcatgtttgt tgaatcttgg cagggtattg actaggtgcg aagagacgaa tcttgttctc 120
aattgggaaa agtgtcattt catggtgaag gaaggcatag tattggacca caagatatca 180



<210> 28 
<211> 192 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 28 

tttgaaatca tgtgtgatgc atcagattac gcagtaggag ctgttctagg ccagaaaata 60
gacaagaagc ttcatgtcat atattacgcc agccgaacgt tggatgacgc tcagggaaga 120
tatgcaacaa ctgagaagga gcttctagct gttgtattcg catttgagaa gttcagaagc 180
tatttggttg ga 192



<210> 29 
<211> 597 
<212> DNA 

<213> Pisum sativum 
<400> 29 

ttggatgcga gaatgattta cccgatctcg gatagtccat gggtcagtcc cgtgcatgtg 60
gttccgaaga aaggtggaaa taccgtcatc cggaatgaca aggatgaatt gatccctacc 120
aaagttgcaa cggggtggag aatgtgtatt gaatataggc ggttgaatac cgcaactcga 180
aaggaccatt ttccactccc gttcatggat caaatgctgg aaagactctc cgggcaacaa 240
tactattgtt tcttggatgg ctattccggg tataaccaaa ttgccgttga cccggccgat 300
cattaaaaga cggctttcac atgtccgttt ggagtgttcg cataccgaaa aatgtccttt 360
gggttgtgca atgcaccgac gactttccaa cgatgtgtgc aagccatttt tgccgacctt 420
aatgagaaaa caatggaagt cttcatggat gacttctcgg tatttggtgt atcctttagt 480
ttatgcttgg caaacttgaa aacggtgctt gaaagatgtg tgaagaccaa tcttgtgctt 540
aattggtaga agtgccactt catggtgacc gaggggatag tgcttggcca taaagtc 597



<210> 30 
<211> 192 
<212> DNA 

<213> Pisum sativum 
<400> 30 

tttgagctaa tgtgtgatgc gagcaactat gcaatcggag cggtattagg ccaaagaaaa 60
gagaaaaaat ttcatgcgat acattacgca agtaaagttc ttaatgaggc tcaaattaac 120
tatgccacca ctgaaaaaga attacttgcg atagtgtatg cacttgaaaa gtttaggtct 180
tatcttatag gg 192



<210> 31 
<211> 581 
<212> DNA 

<213> Pisum sativum 
<400> 31 

tgtgatagtt gccagagaag cggtgggatt ggtaagagag acgagatgtc tctccaaaac 60
atccaagagg tcgaagtatt tgattgttgg ggcatcgatt ttgtaggacc attcccccct 120
cttatggtaa cgagtatatg cttgtcgcag ttgaggcgat tgcctcacct cgggcggatg 180
cgaaaacggt aataattttt ttgaagaaaa acatattttc ccgtttcgga accccccgag 240
tgttgataag tgacggaggg tcacactttt gtaatgcacc gttggaaagc attttaaaac 300
attacggtgt atcacacaga gtggcaactc cgtatcaccc acaggctaat ggacaagccg 360
aggtctctaa tcgtgagatt aagagaattc tcgaaaaaac tgtgtcaaat tcgaaaaaag 420
agtggtcaca aaaattggat gaagcgttat gggcataccg taccgccttt aaagctccaa 480
ttgggctcac tccttttcaa ttggtgtttg gtaaaacttg ccatttgccg gtcgaattgg 540
agcacaaagc cttgtgggct ttgaaaatta ataattttga a 581



<210> 32 

<211> 1362 

<212> DNA 

<213> Glycine max 

<400> 32 

atggcctcct gtaaacaccg agctgtgccc acacccgggg aagcgtccaa ctgggactct 60
tcacgtttca ctttcgagat tgcttggcac agataccagg atagcattca gctccggaac 120
atccttccag agaggaatgt agagcttgga ccagggatgt ttgatgagtt cctgcaggaa 180
ctccagaggc tcagatggga ccaggttctg acccgacttc cagagaagtg gattgatgtt 240
gctctggtga aggagtttta ctccaaccta tatgatccag aggaccacag tccgaagttt 300
tggagtgttc gaggacaggt tgtgagattt gatgctgaga cgattaatga tttcctcgac 360
accccggtca tcttggcaga gggagaggat tatccagcct actctcagta cctcagcact 420
cctccagacc atgatgccat cctttccgct ctgtgtactc cagggggacg atttgttctg 480
aatgttgata gtgccccctg gaagctgctg cggaaggatc tgatgacgct cgcgcagaca 540
tggagtgtgc tctcttattt taaccttgca ctgacttttc acacttctga tattaatgtt 600
gacagggccc gactcaatta tggcttggtg atgaagatgg acctggacgt gggcagcctc 660
atttctcttt agatcagtca gatcgcccag tccatcactt ccaggcttgg gttcccagcg 720
ttgatcacaa cactgtgtga gattcagggg gttgtctctg ataccctgat ttttgagtca 780
ctcagtcctg tgatcaacct tgcctacatt aagaagaact gctggaaccc tgccgatcca 840
tctatcacat ttcaggggac ccgccgcacg cgcaccagag cttcggcgtc ggcatctgag 900
gctcctcttc catcccagca tccttctcag cctttttccc agtgaccacg gcctccactt 960
ctatccacct cagcacctcc atacatgcat ggacagatgc tcaggtcctt gtaccagggt 1020
cagcagatca tcattcagaa cctgtatcga ttgtccctac atttgcagat ggatctgcca 1080
ctcatgactc cggaggccta tcgtcagcag gtcgcctagc taggagacca gccctccact 1140
gacagggggg aagagccttc tggagccgct gctactgagg atcctgccgt tgatgaagac 1200
ctcatagctg acttggctgg cgctgattgg agcccatggg cagacttggg cagaggcagc 1260
tgatcttatg ctttaatgtt ttcttttata ttatgtttgt gttctctttt atgttttatg 1320
ttatgttttt atgtagtctg tttggtaatt aaaaagaggt ag 1362



<210> 33 

<211> 192 

<212> DNA 

<213> Glycine max 

<400> 33 

tttgagttga tgtgtgacgc gagcgattat gctataggtg cagtgcttgg acaaaggaag 60
ggcaaaattt ttcatgctat ctactacgcc agcaaagttt taaatgatgc acaggttaac 120
tatgctacca cagaaaaaga aatgttggca attgtttatg cacttgaaaa gttcaaatct 180
tatttggtag gc 192



<210> 34
<211> 597
<212> DNA

<213> Glycine max 

<400> 34 

ttggaggttg ggctcatata ccccatctct gacaacgctt gggtaagccc agtacaggtg 60
gttcccaaga aaggtggaat gacagtggta caaaatgaga ggaatgactt gataccaaca 120
cgaacagtca ctggctggcg aatgtgtatt gactatcaca agctgaatga agctacacgg 180
aaggaccatt tccccttacc tttcatggat cagatgctgg agagacttgc agggcaggca 240
tactactgtt tcttggatgg atactcggga tacaaccaga tcgcggtaga ccccatagat 300
caggagaaga cggtctttac atgccccttt ggcgtctttg cttacagaag gatgtcattc 360
gggttatgta atgtaccagc cacatttcag aggtgcatgc tgaccatttt ttcagacatg 420
gtggagaaaa gcatcgaggt atttatggac gacttctcgg tttttggacc ctcatttgac 480
agctgtttga ggaacctaga aatggtactt cagaggtgcg tagagactaa cttggtactg 540
aattgggaaa agtgtcattt tatggttcga gagggcatag tcctaggcca caagatc 597



<210> 35 

<211> 603 

<212> DNA 

<213> Glycine max 

<400> 35 

tgtgataaat gtcagagaac aagggggata tctcgaagaa atgagatgcc tttgcagaat 60
atcatggagg tagagatctt tgatagttgg ggcatagact tcatggggcc tcttccttca 120
tcatacagga atgtctacat cttggtagct gtggattacg tctccaaatg ggtggaagcc 180
atagccacgc tgaaggacga tgccagggta gtgatcaaat ttctgaagaa gaacattttt 240
tcccatttcg gagtcccacg agccttgatt agtgatgggg gaacgcactt ctgcaacaat 300
cagttgaaga aagtcctgga gcactataat gtccgacaca aggtggccac accttatcac 360
actcagacga atggccaagc agaaatttct aacagggagc tcaagcgaat cctggaaaag 420
acagttgcat catcaagaaa ggattgggcc ttgaagctcg atgatactct ctgggcctat 480
aggacagcgt tcaagactcc catcggctta tcaccatttc agctagtata tgggaaggca 540
tgtcatttac cagtagagct ggagcacaag gcatattggg ctctcaagtt gctcaacttt 600
gac 603



<210> 36 

<211> 150 

<212> DNA 

<213> Glycine max 

<400> 36 

cctaaaatac tacaacgaca tgattggtgt tttaggataa ttgactgaaa aacctattat 60 
caatttggcg ccgttgccaa ttgggtgttt gtttgttaca tttgagattt cagacttgct 120 
tagatcaagt tctttttcaa ttttcttttt 150



<210> 37
<211> 11
<212> DNA

<213> Glycine max 

<400> 37 
tggcgccgtt g 

<210> 38 

<211> 15 

<212> DNA 

<213> Glycine max 

<400> 38 

tggcgccgtt gccgg 

<210> 39 

<211> 27 

<212> DNA 

<213> Glycine max 

<400> 39 

tttttggcgc cgttgtcggg gattttg 

<210> 40 

<211> 9 

<212> DNA 

<213> Glycine max 

<400> 40 
tttggggga 

<210> 41 

<211> 16 

<212> DNA 

<213> Glycine max 

<400> 41 

tttaatttgg gggatt

<210> 42
<211> 775
<212> DNA

<213> Nicotiana tabacum
<400> 42

gtgcgtaaag aggtttttaa actggagatt atcaagtgat tggatgccgg ggttatctac 60
cccatttacg atagttcatg aacttctccg gtgcaatgtg tcccaaagaa ggtggcatga 120
cggtggtcac caatgagaag aatgagttga ttcctacaag aatggtgacc ggttggagag 180
tgtgcatgga ctatcgcaag ctcaacaaac tcacaaggaa ggatcatttc ccatttccat 240
tccttgacca aatgcttgat aggttggcat gtcgtgcttt ctattgcttt ctagatgtat 300
agtcgggcta tagccaaatc tttattgctc cgtaggatca cgagaaaata cctttacatg 360
tccctatggt acttttgcct acaagcggat gccatttggt ttgtgtaatg cactagcgaa 420
cttttatagg tgtatgatgg ctatcttcac ggacatggtg aaggactacc ttaaagtttt 480
catggatgac ttctcgatgg ttggggattc ctttgatgat tgcttggaaa atttggataa 540
agtattggca agatatgaag aaacgaattt ggtactaaat tgggagaagt gtcatttcat 600
gatcgaggaa ggcattgttc ttggccacaa gatctcaaat aatggcattg aagtcgacaa 660
ggcaaagatt aaggtgattt ctaaacttac acctccaact ttggtgaaag gcgtgcggag 720
tttcttaggc cacgcggggt tttaccaatt cttcataaaa gatttcacaa aggtt 775



<210> 43 
<211> 259 
<212> PRT 

<213> Nicotiana tabacum 
<400> 43 

Val Arg Lys Glu Val Phe Lys Leu Glu Ile Ile Lys Glx Leu Asp Ala
1 5 10 15

Gly Val Ile Tyr Pro Ile Tyr Asp Ser Ser Glx Thr Ser Pro Val Gln
20 25 30

Cys Val Pro Lys Lys Gly Gly Met Thr Val Val Thr Asn Glu Lys Asn
35 40 45

Glu Leu Ile Pro Thr Arg Met Val Thr Gly Trp Arg Val Cys Met Asp
50 55 60

Tyr Arg Lys Leu Asn Lys Leu Thr Arg Lys Asp His Phe Pro Phe Pro
65 70 75 80

Phe Leu Asp Gln Met Leu Asp Arg Leu Ala Cys Arg Ala Phe Tyr Cys
85 90 95

Phe Leu Asp Val Glx Ser Gly Tyr Ser Gln Ile Phe Ile Ala Pro Glx
100 105 110

Asp His Glu Lys Thr Thr Phe Thr Cys Pro Tyr Gly Thr Phe Ala Tyr
115 120 125

Lys Arg Met Pro Phe Gly Leu Cys Asn Ala Leu Ala Asn Phe Tyr Arg
130 135 140

Cys Met Met Ala Ile Phe Thr Asp Met Val Lys Asp Tyr Leu Lys Val
145 150 155 160

Phe Met Asp Asp Phe Ser Met Val Gly Asp Ser Phe Asp Asp Cys Leu
165 170 175

Glu Asn Leu Asp Lys Val Leu Ala Arg Tyr Glu Glu Thr Asn Leu Val
180 185 190

Leu Asn Trp Glu Lys Cys His Phe Met Ile Glu Glu Gly Ile Val Leu
195 200 205

Gly His Lys Ile Ser Asn Asn Gly Ile Glu Val Asp Lys Ala Lys Ile
210 215 220

Lys Val Ile Ser Lys Leu Thr Pro Pro Thr Leu Val Lys Gly Val Arg
225 230 235 240

Ser Phe Leu Gly His Ala Gly Phe Tyr Gln Phe Phe Ile Lys Asp Phe
245 250 255

Thr Lys Val


<210> 44 
<211> 761 
<212> DNA 

<213> Nicotiana tabacum 
<400> 44 

gtgcgtaaag aggtggtcaa gctgttggat gtcggggttg tgtaccccat ctctgatagc 60
tcttggactt cgccggtgca atgtgtacca aagaaggttg gcatgactgt ggtgaaaaat 120
tccaaaaatg agttgattcc gacaagaacc atcaccggtt ggagggtatg catggactac 180
cgcaagttga ataaagtgac ctgcaaggat cactttcctt tgccatttct ggatcagatg 240
ctagatcgac ttgctgggcg tgccttctat tgcttcttgg atgaatattc tgggtataac 300
caaatcttga ttgctccgga agatccggaa aagaccacat tcacttgtcc gtatggcaca 360
tttgttttct ctaggatgcc ttttaggttg tgtaatgcac cagctacatt tcagcggtgt 420
atgatggcca ttttctccta tatggtgaaa gacatttttg aggtgttcat ggacgatttt 480
agtgttgtgg ggcactcatt tgatgaatgc ttgaagaatc ttgatagggt gttggcccat 540
tgtgaagaaa ccaatcttgt cctcaattgg gagaaatgcc actttatggt agaagaagga 600
atcaatctct ggcataaaat ttcaaaacat ggcattgagg tggataaaca aagatagatg 660
tgatttcaag gctccctccc cctacatccg tcaagggagt ccgatgtttt cttgggcatg 720
cggggttcta ttggagattc ataaaagact tctccaaggt t 761



<210> 45 



<211> 254 
<212> PRT 

<213> Nicotiana tabacum 



<400> 45 

Val Arg Lys Glu Val Val Lys Leu Leu Asp Val Gly Val Val Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ser Trp Thr Ser Pro Val Gln Cys Val Pro Lys Lys
20 25 30

Val Gly Met Thr Val Val Lys Asn Ser Lys Asn Glu Leu Ile Pro Thr
35 40 45

Arg Thr Ile Thr Gly Trp Arg Val Cys Met Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Val Thr Cys Lys Asp His Phe Pro Leu Pro Phe Leu Asp Gln Met
65 70 75 80

Leu Asp Arg Leu Ala Gly Arg Ala Phe Tyr Cys Phe Leu Asp Glu Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Leu Ile Ala Pro Glu Asp Pro Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Phe Val Phe Ser Arg Met Pro Phe
115 120 125

Arg Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Tyr Met Val Lys Asp Ile Phe Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Val Gly His Ser Phe Asp Glu Cys Leu Lys Asn Leu Asp Arg
165 170 175

Val Leu Ala His Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Glu Glu Gly Ile Asn Leu Trp His Lys Ile Ser
195 200 205

Lys His Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Ser Arg
210 215 220

Leu Pro Pro Pro Thr Ser Val Lys Gly Val Arg Cys Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Trp Arg Phe Ile Lys Asp Phe Ser Lys Val
245 250



<210> 46 
<211> 762 
<212> DNA 

<213> Nicotiana tabacum 
<400> 46 

gtgcgtaagg aggtgtttaa gttgttggat gttggggttg tgtaccccat ctctgatagc 60
tcttgcattt cgccggtgca atgtgtaccg aagaagggtg gcatgaccgt ggttgcaaat 120
tcgcaaaatg ggttgattcc taccaggatc gtcaccgggt ggaaggtatg catggattac 180
cgaaagttga ataaagtgac ccgcaaggat cactttccat tgccttttct tgatcagatg 240
ttagatcgac ttgctgggcg tgccttctac tgtttcttgg atgggtattc tggatacaac 300
caaatcttca ttactccgga agatcaggag aagacaacat tcacttgtcc atatggcacc 360
tttgcttttt ctaggatgcc ttttgggttg tgtaatgcac cgactacatt ctagcggtat 420
atgatggcca ttttcactga tatggtggaa gatattttgg aggtgttcat ggacgacttt 480
agtgttgtgg gtgattcatt tgatgaatgt ttgaataatc ttgatagagt gttggcccat 540
tgtaaagaaa ccaatcttgt tcttaattgg gagaaatgcc acttcatggt tgaggagggc 600
atagttcttg ggcataaaat tttaaagcat ggtatagagg tggacaaagc aaaaattgat 660
gtgatttcaa ggctccctcc ccctacttct gtcaagggag tgagaagttt tcttaggcat 720
gcggggttct accggagatt catcaaagat ttcaccaaag tt 762



<210> 47 
<211> 254 
<212> PRT 

<213> Nicotiana tabacum 
<400> 47 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Val Gly Val Val Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ser Cys Ile Ser Pro Val Gln Cys Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Ala Asn Ser Gln Asn Gly Leu Ile Pro Thr
35 40 45

Arg Ile Val Thr Gly Trp Lys Val Cys Met Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Val Thr Arg Lys Asp His Phe Pro Leu Pro Phe Leu Asp Gln Met
65 70 75 80

Leu Asp Arg Leu Ala Gly Arg Ala Phe Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Phe Ile Thr Pro Glu Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Phe Ala Phe Ser Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Thr Thr Phe Glx Arg Tyr Met Met Ala Ile
130 135 140

Phe Thr Asp Met Val Glu Asp Ile Leu Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Val Gly Asp Ser Phe Asp Glu Cys Leu Asn Asn Leu Asp Arg
165 170 175

Val Leu Ala His Cys Lys Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Glu Glu Gly Ile Val Leu Gly His Lys Ile Leu
195 200 205

Lys His Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Ser Arg
210 215 220

Leu Pro Pro Pro Thr Ser Val Lys Gly Val Arg Ser Phe Leu Arg His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 48 
<211> 760 
<212> DNA 

<213> Nicotiana tabacum 
<400> 48 

gcggaaggag gtcgtcaagc tgttggatgt cggtgttgtg taccccatat ttgatagctc 60
ttggactttg ccggtgcaat atgtgccgaa gaagggtggt atgaccgtgg ttaccaatgt 120
aaaaaatgag ttgattccta ccaggactgt caccgggtgg agggtgtgca tggattacca 180
caaattgaat aaagtgaccc gcaaggatca ctttccatta ccttttcttg atcagatgtt 240
agacagactt gctgggtgtg ccttctactg tttcttggat gggtattctg ggtgcaacaa 300
aattttgatt gcaccaaaag atcaggagaa gaccaccttt acttgtacgt atggtacctt 360
tgtcttttct aggatgtcat ttgggttgtg taatgcaccg actacattct agaggtgtat 420
gatggccata tttacctaca tggtggagga cattttggag gtgtttatgg atgacttcag 480
tgttgttggt gactagtttg atgaatgttt gaaaaatctt gatagagtgt tggcccgttg 540
tgaagaagcc aaccttgtgc ttaattggga gaaatgccac ttcatggttg aggagggcat 600
agtccttagc cataaaattt caaagcatgg tatagaggtg gacaaagcaa aaattgaagt 660
gatttcaagg ctccttcccc ctacttctgt caagggagtt agaagttttc ttgggcatgc 720
ggggttctac tggagattca tcaaagactt cacgaaggtt 760



<210> 49 
<211> 253 
<212> PRT 

<213> Nicotiana tabacum 
<400> 49 

Arg Lys Glu Val Val Lys Leu Leu Asp Val Gly Val Val Tyr Pro Ile
1 5 10 15

Phe Asp Ser Ser Trp Thr Leu Pro Val Gln Tyr Val Pro Lys Lys Gly
20 25 30

Gly Met Thr Val Val Thr Asn Val Lys Asn Glu Leu Ile Pro Thr Arg
35 40 45

Thr Val Thr Gly Trp Arg Val Cys Met Asp Tyr His Lys Leu Asn Lys
50 55 60

Val Thr Arg Lys Asp His Phe Pro Leu Pro Phe Leu Asp Gln Met Leu
65 70 75 80

Asp Arg Leu Ala Gly Cys Ala Phe Tyr Cys Phe Leu Asp Gly Tyr Ser
85 90 95

Gly Cys Asn Lys Ile Leu Ile Ala Pro Lys Asp Gln Glu Lys Thr Thr
100 105 110

Phe Thr Cys Thr Tyr Gly Thr Phe Val Phe Ser Arg Met Ser Phe Gly
115 120 125

Leu Cys Asn Ala Pro Thr Thr Phe Glx Arg Cys Met Met Ala Ile Phe
130 135 140

Thr Tyr Met Val Glu Asp Ile Leu Glu Val Phe Met Asp Asp Phe Ser
145 150 155 160

Val Val Gly Asp Glx Phe Asp Glu Cys Leu Lys Asn Leu Asp Arg Val
165 170 175

Leu Ala Arg Cys Glu Glu Ala Asn Leu Val Leu Asn Trp Glu Lys Cys
180 185 190

His Phe Met Val Glu Glu Gly Ile Val Leu Ser His Lys Ile Ser Lys
195 200 205

His Gly Ile Glu Val Asp Lys Ala Lys Ile Glu Val Ile Ser Arg Leu
210 215 220

Leu Pro Pro Thr Ser Val Lys Gly Val Arg Ser Phe Leu Gly His Ala
225 230 235 240

Gly Phe Tyr Trp Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 50 
<211> 762 
<212> DNA 

<213> Oryza sativa 
<400> 50 

gtgcgtaagg aggtgtttaa gttcctgtat gccaggatta tttatctcgt accatacagc 60
gagtgggtta gcccagttca ggtcgtgcca aagaagggag gaatgacggc cgttgcaaat 120
gctcaaaatg aactaatccc gcaacgaacc gtaaccggat ggagaatgtg catcgattac 180
aggaaactta acaaggctac aaaaaaggat catttcccgc tacccttcat tgatgaaatg 240
ttggaacggc tggcaaatca ttccttcttc tgtttccttg atgggtattc aggatatcat 300
caaattccca tccatccgga ggaccagagt aagactacgt tcacatgtcc atatggcacc 360
tatgcgtatc gtaggatgcc ctttggactg tgcaacactc ctgcatcttt ccaaaggtgt 420
atgatgtcta ttttctcgga catgatcgag gatatcatgg aagtcttcat ggatgacttc 480
tcggtctatg gaaagacttt gggtcattgt ctgcagaatc tagacaaagt cttacaacga 540
tgccaagaaa aggacctagt gcttaactgg gaaaagtgcc atttcatggt ctgtgaaggg 600
atagttcttg ggcatcgagt gtccgaacga ggagtcgaag ttgatcgtgc taaaattgat 660
gtgatagatc agcttcctcc acccgtgaac atcaaaggaa tccgcagctt ctttggtcac 720
gctggctttt atagaaggtt catcaaggac ttcacaaaag tt 762



<210> 51 
<211> 254 
<212> PRT 

<213> Oryza sativa 
<400> 51 

Val Arg Lys Glu Val Phe Lys Phe Leu Tyr Ala Arg Ile Ile Tyr Leu
1 5 10 15

Val Pro Tyr Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Ala Val Ala Asn Ala Gln Asn Glu Leu Ile Pro Gln
35 40 45

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met
65 70 75 80

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr His Gln Ile Pro Ile His Pro Glu Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Thr Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile
130 135 140

Phe Ser Asp Met Ile Glu Asp Ile Met Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Lys Thr Leu Gly His Cys Leu Gln Asn Leu Asp Lys
165 170 175

Val Leu Gln Arg Cys Gln Glu Lys Asp Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Cys Glu Gly Ile Val Leu Gly His Arg Val Ser
195 200 205

Glu Arg Gly Val Glu Val Asp Arg Ala Lys Ile Asp Val Ile Asp Gln
210 215 220

Leu Pro Pro Pro Val Asn Ile Lys Gly Ile Arg Ser Phe Phe Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 52 
<211> 761 
<212> DNA 

<213> Oryza sativa 
<400> 52 
gtgcgcaagg aggttttgaa attgctgcat gccaggatta tctatcccgt accatacagt 60
gagagggtta gcccagtcca ggttgtgcca aagaagggag gaatggcggt cgttgcaaat 120
gctcagaatg aactaattac gcaacaaacc gtaaccggat ggaggatgtg tatcgattac 180
aggaaactca acaaggctac aaaaaaggat catttcccgc tacccttcat tgttgaaatg 240
ttggaacggc tggcaaatca ttccttcttt tgtttccttg atggatattt cggatatcat 300
caaattccca tccatccgga ggactagagt aagactacgt tcacatgtcc atatggcacc 360
tatgcgtatc ataggatgtc ctttggactg tgcaacgctc ctgcatcttt ccaaggtgta 420
tgatgtctat tttctcggac atgatcgagg atatcatgga agtcttcatg gatgacttct 480
cggtctatgg aaagactttc ggtcattgtc tgcaaaatct agacaaagtc ttacaacgat 540
gccaagaaaa ggacctggtg cttaactggg aaaagtgaca tttcatggtc cgtgaaggga 600
tagttcttgg gcatcgagtg ttcgaacaag gaatcgaagt tgatcatgct aaaattgatg 660
tgatagatca gcttcctcct cccgtgaaca tcaaaggtat ccgcagcttc ttgggtcatg 720
tcggctttta tagaaggttc atcaaggact tcactaaagt t 761



<210> 53 
<211> 254 
<212> PRT 

<213> Oryza sativa 



<400> 53 
Val Arg Lys Glu Val Leu Lys Leu Leu His Ala Arg Ile Ile Tyr Pro
1 5 10 15

Val Pro Tyr Ser Glu Arg Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Gly Gly Met Ala Val Val Ala Asn Ala Gln Asn Glu Leu Ile Thr Gln
35 40 45

Gln Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Phe Pro Leu Pro Phe Ile Val Glu Met
65 70 75 80

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Phe Gly Tyr His Gln Ile Pro Ile His Pro Glu Asp Glx Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr His Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile
130 135 140

Phe Ser Asp Met Ile Glu Asp Ile Met Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Lys Thr Phe Gly His Cys Leu Gln Asn Leu Asp Lys
165 170 175

Val Leu Gln Arg Cys Gln Glu Lys Asp Leu Val Leu Asn Trp Glu Lys
180 185 190

Glx His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Arg Val Phe
195 200 205

Glu Gln Gly Ile Glu Val Asp His Ala Lys Ile Asp Val Ile Asp Gln
210 215 220

Leu Pro Pro Pro Val Asn Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 54 

<211> 762 

<212> DNA 

<213> Oryza sativa 

<400> 54 

gtgcggaaag aggtttttaa gctcctgcat gccgggatta tttataccgt tccatgcagt 60
gagtgggtca gcacagtcca ggttgggccg aagatgggat gaatgacggt cgttgcaaat 120
gctcaaaata aacttatccc gcaaccaacc ataaccggat ggaggatgtg catagactac 180
aggaaactca acaaggctac aaaagaggat cattttccgc tacccttcat tgatgaaatg 240
ttggaacgga tgacaaatca ttccttcttc tgtttccttg atgggtattc cggatatcat 300
caaattccca tccgtccaga ggaccagagt aagactacgt tcacatgtcc atatggcacc 360
tatgcgtatc gtaggatgtc cttcggactg tgcaacgctc ctgcatcttt ccaaaggtgt 420
atgttgtcta ttttctcgga catgatcgaa gatatcatga aagtcttcat ggatgacttc 480
tcagtttatg gaaagacttt cggtcattgt ctgtagaatc tagacaaagt cttacaacga 540
tgccaagaaa atgacctagt gtttaattgg gaaaagtgcc attttatggt ccgtgaaggg 600
atagttcttg ggcatcgagt atccgaatga ggaatcgaag ttgatcgtgc taaaatcgat 660
gttatagatc aaattcgtcc tcctgcgaat atcaaaggaa tccgcagctt cttgggacat 720
gccggctttt atagaaggtt cctcaaggac ttcacaaaag tt 762



<210> 55 

<211> 254 

<212> PRT 

<213> Oryza sativa 
<400> 55

Val Arg Lys Glu Val Phe Lys Leu Leu His Ala Gly Ile Ile Tyr Thr
1 5 10 15

Val Pro Cys Ser Glu Trp Val Ser Thr Val Gln Val Gly Pro Lys Met
20 25 30

Gly Glx Met Thr Val Val Ala Asn Ala Gln Asn Lys Leu Ile Pro Gln
35 40 45

Pro Thr Ile Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Glu Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met
65 70 75 80

Leu Glu Arg Met Thr Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr His Gln Ile Pro Ile Arg Pro Glu Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr Arg Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Leu Ser Ile
130 135 140

Phe Ser Asp Met Ile Glu Asp Ile Met Lys Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Lys Thr Phe Gly His Cys Leu Glx Asn Leu Asp Lys
165 170 175

Val Leu Gln Arg Cys Gln Glu Asn Asp Leu Val Phe Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Arg Val Ser
195 200 205

Glu Glx Gly Ile Glu Val Asp Arg Ala Lys Ile Asp Val Ile Asp Gln
210 215 220

Ile Arg Pro Pro Ala Asn Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Leu Lys Asp Phe Thr Lys Val
245 250






<210> 56 
<211> 762 
<212> DNA 

<213> Oryza sativa 
<400> 56 

gtgcgtaagg aggtcttgaa gctcttgcat gccgagatta tttatcccgt accatataga 60
gagtgggtta gcccggtcta ggttatgccg aagaagggac gaatgacggt cattgcaaat 120
gctcaaaatg aacttattcc gcaacgaaca gtaaccggat ggaggatgtg catagattac 180
atgaaactta acaaggctac gaaaaaggat catttcccac tacccttcat tgatgaaatg 240
ttggaacggc tggcaaatca ttctttcttc cgtttccttg atgggtattc taggtatgat 300
caaattccca tccatccgga ggaccaaagt aagactacgt tcacatgttc gtatgatacc 360
tatgcttatc gtaggatgtc cttcggactg tgcaacgctc ctgcatcttt ccaaaggtgt 420
atgatgtcta ttttctccga catgattaag gacattatgg aagtcttcat gcatgacttc 480
tctatttatg gaaagacctc cggtcattgt ctacaaaatt tagacaaaat tttgcaacga 540
tgccaagaga aggacctggt acttaattgg gaaaagtgtc atttcatggt ccgtgaaggg 600
atagttctta gtcatcgagt gtccgaataa ggaatcgaag ttgatcgtgc taaaaactat 660
gtaatagatt agcttccttc tcctgtgaac attaagggga tccgcaattt tttgggacat 720
gctggctttt atagaaggtt catcaaagac ttcacaaagg tt 762



<210> 57 

<211> 254 

<212> PRT 

<213> Oryza sativa 

<400> 57 

Val Arg Lys Glu Val Leu Lys Leu Leu His Ala Glu Ile Ile Tyr Pro
1 5 10 15

Val Pro Tyr Arg Glu Trp Val Ser Pro Val Glx Val Met Pro Lys Lys
20 25 30

Gly Arg Met Thr Val Ile Ala Asn Ala Gln Asn Glu Leu Ile Pro Gln
35 40 45

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Met Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met
65 70 75 80

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Arg Phe Leu Asp Gly Tyr
85 90 95

Ser Arg Tyr Asp Gln Ile Pro Ile His Pro Glu Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Ser Tyr Asp Thr Tyr Ala Tyr Arg Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile
130 135 140

Phe Ser Asp Met Ile Lys Asp Ile Met Glu Val Phe Met His Asp Phe
145 150 155 160

Ser Ile Tyr Gly Lys Thr Ser Gly His Cys Leu Gln Asn Leu Asp Lys
165 170 175

Ile Leu Gln Arg Cys Gln Glu Lys Asp Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Ser His Arg Val Ser
195 200 205

Glu Glx Gly Ile Glu Val Asp Arg Ala Lys Asn Tyr Val Ile Asp Glx
210 215 220

Leu Pro Ser Pro Val Asn Ile Lys Gly Ile Arg Asn Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 58 
<211> 762 
<212> DNA 

<213> Hordeum vulgare 
<400> 58 

gtgcgcaagg aggtttagaa gttcctggaa gcaggtatca tctatcgtgt tgctcatagt 60
gattggttga gtcgggtgca ttgtgtccct aagaagggag gcattaccgt tgtccctaat 120
gataaggatg aattgatccc acagaggact attactggct ataggatggt gattgatttt 180
aggaaattga ataaagccac taggaaagat cattaccctt tgccttttat cgaccaaatg 240
cgagaaaggc tgtctaaaca cacacacttc tgctttctaa acggttattt tggtttctcc 300
caaataccag ttgcacaatc tgatcaggag aaaaccactt tcacctgccc ttttggtaca 360
tttgcttata gacgtatgac ttttggctta tgtaatgcac ctgcctcctt tcaaagatgt 420
atgatggcta tattccctga cttttgtgaa aagattgttg aggttttcat ggatgacttc 480
tccatttacg gatcttcctt tgatgattgc ctcagcaacc ttgatcgagt cttgcagaga 540
tgtaaagaca ccaatctttt cttgaattgg aagaagtgcc actttatggt taatgacggc 600
atcgtcttag gacataaatt ttctgaaaga ggtattgaag tcgataaggc taaggttgat 660
ggaatcgaga aaatgccata ccccacagat atcaaaggga taagaagttt ccttggtcat 720
gctggtttct atagaaggtt cataaaagac ttcactaagg tt 762



<210> 59 
<211> 254 
<212> PRT 

<213> Hordeum vulgare 



<400> 59 

Val Arg Lys Glu Val Glx Lys Phe Leu Glu Ala Gly Ile Ile Tyr Arg
1 5 10 15

Val Ala His Ser Asp Trp Leu Ser Arg Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Arg Thr Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Arg Glu Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asn Gly Tyr
85 90 95

Phe Gly Phe Ser Gln Ile Pro Val Ala Gln Ser Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Thr Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Pro Asp Phe Cys Glu Lys Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Ile Tyr Gly Ser Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Lys Asp Thr Asn Leu Phe Leu Asn Trp Lys Lys
180 185 190

Cys His Phe Met Val Asn Asp Gly Ile Val Leu Gly His Lys Phe Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Gly Ile Glu Lys
210 215 220

Met Pro Tyr Pro Thr Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 60 
<211> 762 
<212> DNA 

<213> Hordeum vulgare 
<400> 60 

gtgcgtaaag aggtcctaaa gttcctggaa gcgggtatta tctatcctgt tgctcacaac 60
gattgggtga gtccggtgca ttgcgtccct aagaagggat gcattaccgt tgtccctaat 120
gataaggatg aattgatccc acataggatt attactggct ataggatggt gatcgatttt 180
aggaaaatga ataaagccac taggaaagaa cattaccctt tgccttttag cgaccaaatg 240
ctagaaaggt tgtctaaaca cacacacttc tgctttctag acggttattc tagtttctcc 300
caaatactag ttgcacaatc tgatcaggag aaaaccactt tcacctaccc gttcggtacc 360
tttgcttata gacgtatgcc ttttggctta tgtaatgcac ctgccacctt tcaaagatgt 420
atgatggcta tattctctga cttttgtgaa aagtttgtcg aggttttcat ggatgacttt 480
tccgtttacg gatcttcctt tgatgattgc ctcaacaacc ttgatcgggt cttgcagaga 540
tgtaaagata ctaatcttgt cttgaattgg gagaagtgcc actttatggt taatgaaggc 600
atcgtcttag gacataaaat ttccgaaaga ggtattgaat tcgataaggc taaggttggt 660
gcaatcaaga aaatgccata ccccacagat atcaaaggta taagaagttt cttggtccat 720
gctggtttct atagaaggtt catcaaggac tttacaaagg tt 762



<210> 61 

<211> 254 

<212> PRT 

<213> Hordeum vulgare 



<400> 61 
Val Arg Lys Glu Val Leu Lys Phe Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala His Asn Asp Trp Val Ser Pro Val His Cys Val Pro Lys Lys
20 25 30

Gly Cys Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro His
35 40 45

Arg Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Met Asn
50 55 60

Lys Ala Thr Arg Lys Glu His Tyr Pro Leu Pro Phe Ser Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Ser Phe Ser Gln Ile Leu Val Ala Gln Ser Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Tyr Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Phe Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Ser Ser Phe Asp Asp Cys Leu Asn Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Lys Asp Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Ile Glu Phe Asp Lys Ala Lys Val Gly Ala Ile Lys Lys
210 215 220

Met Pro Tyr Pro Thr Asp Ile Lys Gly Ile Arg Ser Phe Leu Val His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 62 
<211> 757 
<212> DNA 

<213> Hordeum vulgare
<400> 62 

gaaaagaggt tgtgaagctc ctggatgaag gtattatcta tcatgttgct catagcgatt 60
gggtgagtcc ggtgcatagc gttcctaaga agggaggcat taccgttgtc cctaatgata 120
aggatgaatt gatcccgcag aggattatca ctggctatag gatggtgatc gatttcagga 180
aactgaataa agccactagg aaagatcatt accctttgcc ttttatcgac catatgctag 240
aaaggttgtc caaactcaca cacttctgct ttctagacgg ttattctagt ttctcccaaa 300
taccagttgc acaatctgat caggagaaaa ccactttcac ctgccctttc ggtacctttg 360
cttatagacg tatgcctttt ggcttatgta atgcacctgc cacctttcaa agatgtatga 420
tggctatatt ctctaacttt tgtgaaaata ttgtcgaggt tttcatggat gacttttccg 480
tttacgggtc ttcttttgat gattgcctca gcaaccttga tcgagtctta cagagatgta 540
aagacaccaa tcttgtcttg aatggggaga agtgccactt tatggttaat gaaggcatcg 600
tcttaggaca taaaatttct gaaagaggta ttgaagtcga taaggctaag gttgatgcaa 660
tcgacaaaat gccatacccc acagatatca aaggtataag aagtttcctt ggtcatggtg 720
gtttctatag aaggtttatc aaagatttca caaaggt 757



<210> 63 
<211> 251 
<212> PRT 

<213> Hordeum vulgare 
<400> 63 

Lys Glu Val Val Lys Leu Leu Asp Glu Gly Ile Ile Tyr His Val Ala
1 5 10 15

His Ser Asp Trp Val Ser Pro Val His Ser Val Pro Lys Lys Gly Gly
20 25 30

Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln Arg Ile
35 40 45

Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn Lys Ala
50 55 60

Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Ile Asp His Met Leu Glu
65 70 75 80

Arg Leu Ser Lys Leu Thr His Phe Cys Phe Leu Asp Gly Tyr Ser Ser
85 90 95

Phe Ser Gln Ile Pro Val Ala Gln Ser Asp Gln Glu Lys Thr Thr Phe
100 105 110

Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe Gly Leu
115 120 125

Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile Phe Ser
130 135 140

Asn Phe Cys Glu Asn Ile Val Glu Val Phe Met Asp Asp Phe Ser Val
145 150 155 160

Tyr Gly Ser Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg Val Leu
165 170 175

Gln Arg Cys Lys Asp Thr Asn Leu Val Leu Asn Gly Glu Lys Cys His
180 185 190

Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser Glu Arg
195 200 205

Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Ile Asp Lys Met Pro
210 215 220

Tyr Pro Thr Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His Gly Gly
225 230 235 240

Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys
245 250



<210> 64 
<211> 740 
<212> DNA 

<213> Hordeum vulgare 
<400> 64 

gtgcgtaaag aggtgattaa attcctagaa gaaggtatta tctatcctgt tgctcacagc 60
gattgggtga gtccggtgca ttgcattcct aagaaaggag gcattaccgt tgtccctaat 120
gataaggatg aattgatccc atagaggatt attactggct ataggatggt gattgatttt 180
aggaagttga ataaagccac taggaaagat cattaccctt tgccttttat cgaccaaatg 240
ctagaaaggc tgtctaaaca cacacacttc ttgtttctgg acggttatac tggtttctcc 300
caaataccag ttgcacaatt tgatcaggag aaaaccactt taacctgaca tttcggtacc 360
tttgcttata tacgtatgcc ttttggcttg tgtaatgcac ctgccacctt tcaaagatgt 420
atgatggcta tattctccga cttctgtgaa aagattgtca atgttttcat ggataacttc 480
tccgtttacg ggtgttcctt tgatgattgc ctcaacaacg ttgatcgagt cttacagaga 540
tgtaaggaca ccaatgttgt cttgaattgg gagaagtgtc actttatggt taatgaaggc 600
atcgtcttag gacataagat ttctgaaaga ggtattaaag ttgataaggc taaggttgat 660
gcaatcgaga aaatgccata tccacagata tcaaaggtat aagaagtttc cttggtcatg 720
ctggtttcta tagaaggttc 740



<210> 65 
<211> 247 
<212> PRT 

<213> Hordeum vulgare 
<400> 65 

Val Arg Lys Glu Val Ile Lys Phe Leu Glu Glu Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala His Ser Asp Trp Val Ser Pro Val His Cys Ile Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Glx
35 40 45

Arg Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ser Lys His Thr His Phe Leu Phe Leu Asp Gly Tyr
85 90 95

Thr Gly Phe Ser Gln Ile Pro Val Ala Gln Phe Asp Gln Glu Lys Thr
100 105 110

Thr Leu Thr Glx His Phe Gly Thr Phe Ala Tyr Ile Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Ile Val Asn Val Phe Met Asp Asn Phe
145 150 155 160

Ser Val Tyr Gly Cys Ser Phe Asp Asp Cys Leu Asn Asn Val Asp Arg
165 170 175

Val Leu Gln Arg Cys Lys Asp Thr Asn Val Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Ile Lys Val Asp Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Met Pro Tyr Pro Thr Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe
245



<210> 66 
<211> 762 






<212> DNA 

<213> Avena sativa 
<400> 66 

gtgcgaaagg aggttttcaa gctcatggat gctggtatta tttaccctat tgctgatagt 60
gaatgggtta gtcatgttca ttgtgttcct aaaaagggag gtattaccgt tgtccctaat 120
gataatgatg agcttattcc tcaaagaata gtggtaggct ataggatgtg catcgatttt 180
aggaaagtca ataaagttac taagaaagat cactacccgc ttccttttat tgatcaaatg 240
ttggaaagat tttctaaaaa gacccatttt tgttttcttg atggttattc tggtttctct 300
caaattgttg ttaaacaaca agatcaagaa aaaactactt ttacttgccc ttatggaact 360
tatgcttata gatgtatgcc ttttggttta tgtaatgctc cttctacttt cctaaggtgc 420
atgtctgcta tctttcatgg tttttgtgag gaaattgtag aagtgttcat ggacgacttt 480
tctgtctacg gaacttcttt tgataattgt ctgcacaacc ttgataaagt tttacagaga 540
tgtgaaggaa ctaatcttgt tcttaattgg gagaaatgcc acttcatggt taatgaaggg 600
attgttcttg ggcataaagt ttctaaaaga ggcatagaag ttgatagagc taaggttgag 660
gcaattgaga agatgccatg tccaagagac atcaaaggta ttcgtagtat ccttggtcat 720
gctggtttct ataggaggtt catcaaagac ttcacaaagg tt 762



<210> 67 
<211> 254 
<212> PRT 

<213> Avena sativa 



<400> 67 
Val Arg Lys Glu Val Phe Lys Leu Met Asp Ala Gly Ile Ile Tyr Pro
1 5 10 15

Ile Ala Asp Ser Glu Trp Val Ser His Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Asn Asp Glu Leu Ile Pro Gln
35 40 45

Arg Ile Val Val Gly Tyr Arg Met Cys Ile Asp Phe Arg Lys Val Asn
50 55 60

Lys Val Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Phe Ser Lys Lys Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Ser Gln Ile Val Val Lys Gln Gln Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr Arg Cys Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ser Thr Phe Leu Arg Cys Met Ser Ala Ile
130 135 140

Phe His Gly Phe Cys Glu Glu Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Thr Ser Phe Asp Asn Cys Leu His Asn Leu Asp Lys
165 170 175

Val Leu Gln Arg Cys Glu Gly Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser
195 200 205

Lys Arg Gly Ile Glu Val Asp Arg Ala Lys Val Glu Ala Ile Glu Lys
210 215 220

Met Pro Cys Pro Arg Asp Ile Lys Gly Ile Arg Ser Ile Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 68 
<211> 762 
<212> DNA 

<213> Avena sativa 
<400> 68 

gtgcgcaaag aggtctttaa gttccttgat gctggtatta tttaccctat tgctgatagt 60
caatgggtta gccttgttca ttgtgtcccc aagaaagggg gaataactgt tgtgcctaat 120
gaagataatg agcttatacc ccaaagagta gtggttgtgt atagaatgtg cattgatttt 180
agaaggatta ataaagttac taggaaagat cattatcctt tgccctttat tgatcaaatg 240
cttgagaggt tgtccaaaaa gactcacttt tgttttcttg atggtcattc tgggttttct 300
caaattgttg tgaaagcaca agaccaagag aaaactactt tcacttgtcc ttatggtact 360
tatgattata ggcgtatgcc ttttggttta tgtaatgctc ctgctacctt tcagagatgt 420
atgtctgcta tatttcatgg tttttgtgaa gaaattgtgg aggttttcat ggacgatttt 480
tctgtctatg gaacttcttt tgataactgt ttgcacaacc ttgataaatt tttgcagaga 540
tttgaagaaa ccaaccttgt tcttaattgg gagaaatgcc atttcatggt taatgaaggg 600
attgttcttg gacacaagat ctcagaaaga ggcattgaag ttgacagagc caaaattgaa 660
gcaattgaga acatgccttg ccctagagat attaaaggta ttcgtagtat ccttggtcat 720
gctggtttct atagtaggtt catcaaagac tttacaaaag tt 762



<210> 69 






<211> 254 
<212> PRT 

<213> Avena sativa 



<400> 69 

Val Arg Lys Glu Val Phe Lys Phe Leu Asp Ala Gly Ile Ile Tyr Pro
1 5 10 15

Ile Ala Asp Ser Gln Trp Val Ser Leu Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Glu Asp Asn Glu Leu Ile Pro Gln
35 40 45

Arg Val Val Val Val Tyr Arg Met Cys Ile Asp Phe Arg Arg Ile Asn
50 55 60

Lys Val Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ser Lys Lys Thr His Phe Cys Phe Leu Asp Gly His
85 90 95

Ser Gly Phe Ser Gln Ile Val Val Lys Ala Gln Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Asp Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Ser Ala Ile
130 135 140

Phe His Gly Phe Cys Glu Glu Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Thr Ser Phe Asp Asn Cys Leu His Asn Leu Asp Lys
165 170 175

Phe Leu Gln Arg Phe Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Arg Ala Lys Ile Glu Ala Ile Glu Asn
210 215 220

Met Pro Cys Pro Arg Asp Ile Lys Gly Ile Arg Ser Ile Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Ser Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 70 

<211> 756 

<212> DNA 

<213> Avena sativa 



<400> 70 

aaggaggttt ttaaactcct tgatgttggt attatttacc ctattgctga tagtgaatgg 60
gttagtcttg ttcattgtgt tcctaaaaag ggaggtatta ccgttgttcc taatgataat 120
gatgagctta ttcctcaaag aatagtggta ggctatagga tgtgcataga ttttaggaaa 180
gttaataaag ttactaagaa agatcactac ccgcttcctt ttattgatca aatgttggaa 240
aggttgtcta aaaagaccca tttttgtttt cttgatggtt actctagctt ctctcaaatt 300
gctgttaaac aacaagatca agaaaaaact acttttactt gcccttatgg aacttttgct 360
tatagacgta tgcctattgg tttatgtaat gctcctgcta cttttcaaag gtgtatgtct 420
gctatatttc atggtttttg tgaggaaatt gtagaagtgt tcatggatga cttttctgtc 480
tatggaactt cttttgataa ttgcctgcac aaccttgata aagttttgca gagatgtgaa 540
gaaactaata ttgttcttaa ttgggagaaa ttccacttca tggttaatga agggattgtc 600
cttgggcata aagtttctaa aagaggcata gaagttgata gagctaaggt tgaggcaatt 660
gagaagatgc catgcccaag agacatcaaa ggtatacgta gtatccttgg tcatgctggt 720
ttctatagaa ggtttatcaa agacttcaca aaggtt 756



<210> 71 
<211> 252 
<212> PRT 

<213> Avena sativa 



<400> 71 
Lys Glu Val Phe Lys Leu Leu Asp Val Gly Ile Ile Tyr Pro Ile Ala
1 5 10 15

Asp Ser Glu Trp Val Ser Leu Val His Cys Val Pro Lys Lys Gly Gly
20 25 30

Ile Thr Val Val Pro Asn Asp Asn Asp Glu Leu Ile Pro Gln Arg Ile
35 40 45

Val Val Gly Tyr Arg Met Cys Ile Asp Phe Arg Lys Val Asn Lys Val
50 55 60

Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met Leu Glu
65 70 75 80

Arg Leu Ser Lys Lys Thr His Phe Cys Phe Leu Asp Gly Tyr Ser Ser
85 90 95

Phe Ser Gln Ile Ala Val Lys Gln Gln Asp Gln Glu Lys Thr Thr Phe
100 105 110

Thr Cys Pro Tyr Gly Thr Phe Ala Tyr Arg Arg Met Pro Ile Gly Leu
115 120 125

Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Ser Ala Ile Phe His
130 135 140

Gly Phe Cys Glu Glu Ile Val Glu Val Phe Met Asp Asp Phe Ser Val
145 150 155 160

Tyr Gly Thr Ser Phe Asp Asn Cys Leu His Asn Leu Asp Lys Val Leu
165 170 175

Gln Arg Cys Glu Glu Thr Asn Ile Val Leu Asn Trp Glu Lys Phe His
180 185 190

Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser Lys Arg
195 200 205

Gly Ile Glu Val Asp Arg Ala Lys Val Glu Ala Ile Glu Lys Met Pro
210 215 220

Cys Pro Arg Asp Ile Lys Gly Ile Arg Ser Ile Leu Gly His Ala Gly
225 230 235 240

Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 72 
<211> 748 
<212> DNA 

<213> Secale cereale 
<400> 72 

gtgcggaaag aggtctttaa actcctagag gcaggtatta actatcccat tgctgatagc 60
cagcgggtaa gtcatgtcca ttgtgttcct aagaaaggag gtatgactgt cgtccctaag 120
gataaagatg aatttatccc gcaaagaata gttacaggtt ataggatggt aattgatttt 180
cgtaagttaa ataaagctac tatgaaagat cattacccct tgccatttat tgatcaaatg 240
ccagacaggt tatccaaaca tactcatttc tgctttctag atggttattc tggtttctct 300
caaatacctt tgtcaaaggg ggatcaagaa aagaccacct ttacttgtcc tttcggtacc 360
tttgcttata gaggtatgcc ttttggttta tgtaatgcac ctgctacctt tcaaagatgt 420
atgatcgtta tattctctgt cttttttgaa aagattgttg aggtattcat ggatgatttc 480
tccgtttatg gaacttcttt tgatgattgc ttaagcaacc ttgatcgagt tttgcagaga 540
tgtgaagata ctaaccttgt cttgaattgg gagaagtgcc actttatggt taatgaaggc 600
attttcttgg gacataaaat ttctgaaaga ggtactgaag ttgagaaagc taaagtggat 660
gctattgaaa agatgccatg ccctaaggat atgaaaggta tacgaagttt ccttggtcac 720
gctgggtttt ataggaggtt cataaaag 748



<210> 73 
<211> 249 
<212> PRT 

<213> Secale cereale 
<400> 73 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Ile Asn Tyr Pro
1 5 10 15

Ile Ala Asp Ser Gln Arg Val Ser His Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Pro Lys Asp Lys Asp Glu Phe Ile Pro Gln
35 40 45

Arg Ile Val Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Met Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Pro Asp Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Ser Gln Ile Pro Leu Ser Lys Gly Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Gly Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Ile Val Ile
130 135 140

Phe Ser Val Phe Phe Glu Lys Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Thr Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Asp Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Phe Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Thr Glu Val Glu Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Met Pro Cys Pro Lys Asp Met Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys
245



<210> 74 
<211> 762 
<212> DNA 

<213> Secale cereale 
<400> 74 

gtgcggaagg aggtcgttaa gcttccagag gcaggtatta tctatcccgt tgctgatagc 60
cagtgggtaa gtcatgtcca ttgtgtccct aagaagggag gtatgactgt cgttcctaat 120
gacaaacatg aattgatccc gcaaagaata gttacaggtt ataggatggt aattgatttc 180
cgtaagttaa ataaagctac taagaaagat cattacccct tgccatttat tgatcaaatg 240
ctagacaggt tatccaaaca tactcatttt tgctttctag atggttatta tggtttctct 300
caaatacctg tgtcaaaagg ggatcaagaa aagaccactt tcacttgtcc tttcggtacc 360
tttgcttata gacgtatgcc ttttggttta tgtaatgcac ctgctacctt tcaaagatgt 420
atgatggcta tattatctga tttttgagaa aagattgttg aggttttcat ggatgatttc 480
tccgtttacg gaacttcttt tgatgactac ttaagcaaca atgatcgagt tttgcagaga 540
tgtgaagaca ctaatcttgt tttgaattgg gagaagtgcc actttatggt taatgaaggc 600
attgtcttgg gacaaaaaat ttctgaaaga ggtattgaag ttgacaaagc taaagtcgat 660
gctgttgaaa agatgccatg ccccaaggac atcaaaggta tacgaagttt ccttggtcat 720
gttgggtttt ataggaggtt catcaaagac ttcacgaaag tt 762

<210> 75 
<211> 254 
<212> PRT 

<213> Secale cereale 
<400> 75 

Val Arg Lys Glu Val Val Lys Leu Pro Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Asp Ser Gln Trp Val Ser His Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Pro Asn Asp Lys His Glu Leu Ile Pro Gln
35 40 45

Arg Ile Val Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Asp Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Tyr Gly Phe Ser Gln Ile Pro Val Ser Lys Gly Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Leu Ser Asp Phe Glx Glu Lys Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Thr Ser Phe Asp Asp Tyr Leu Ser Asn Asn Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Asp Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly Gln Lys Ile Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Val Glu Lys
210 215 220

Met Pro Cys Pro Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 76 
<211> 762 
<212> DNA 

<213> Secale cereale 
<400> 76 



gtgcgtaagg aggtggttaa gctcctagaa gcaggtatta tctatccagt tgctgatagt 60
cagtgggtaa gtcatgtcca ttatgttcct aagaaaggag gtatgactgt tgtccctaat 120
gataaagatg aattgatccc gcaaagaata gttacaggtt ataggatggt aagtgatttc 180
cgtaagttga ataaagccac taagaaagat cattacccct tgccatttat tgatcaaatg 240
ctagaaaggt tatccaaaca tactcatttc ttctttctag atggttattc tggtttctct 300
caaatacctg tgtcaaaagg ggatcaagaa aagaccacct ttacttgtac tttcggtacc 360
tttgcttata gacgtatgcc ttttggttta tgtaatgcac ctgctacctt tcaaagatgc 420
atgatggcta tattctctga cttttgtgaa aagattgttg aggtattcat ggatgatttc 480
tccgtttacg gaacttcttt tgatgattgc ttaagcaacc ttgatcgagt tttgcagaga 540
tgtgaagaca ctaaccttgt cttgaattgc gagaagtgcc actttatggt taatgaaggc 600
attgtcttgg gacataaaat ttctgaaata ggtattgaag ttgacaaagc taaagttgat 660
gctattgaaa agatgccatg cgcaaaggac atcaaaggta tacggagttt ccttggtcat 720
gccgggtttt ataggaggtt catcaaagat ttctcaaagg tt 762



<210> 77 
<211> 254 
<212> PRT 

<213> Secale cereale 



<400> 77 

Val Arg Lys Glu Val Val Lys Leu Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Asp Ser Gln Trp Val Ser His Val His Tyr Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Arg Ile Val Thr Gly Tyr Arg Met Val Ser Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ser Lys His Thr His Phe Phe Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Ser Gln Ile Pro Val Ser Lys Gly Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Thr Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Ile Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Thr Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Asp Thr Asn Leu Val Leu Asn Cys Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Ile Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Met Pro Cys Ala Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Ser Lys Val
245 250



<210> 78 
<211> 759 
<212> DNA 

<213> Secale cereale 
<400> 78 

gtgcgcaagg aagtttttaa gtttctagag gcaggtataa tctatccagt tgctgatagc 60
cagtgggtaa gtcctgtcca ttgtgtccct aagaagggag gtatgactgt agttcctaat 120
gataaagatg aattgatctc gcaaagaatt gttacaggtt ataggatggt aattgatttt 180
cgcaaattaa ataaagccac taagaaagat caataccctt tgccttttat tgatcaaatg 240
ctagaaaggt tatccaaaca cacccatttt tgctttctag atggttattc tagtttctct 300
caaataccta tgtcaaaagg ggataaagaa aagaccactt ttacttgtcc ctttggtact 360
ttgcttatag acgtatgcct tttggtttat gtaatgcatc tgctaccttt caaacatgca 420
tgatggctat actctatgat ttttgtgaaa gaatgttgat gttttcatgg atgatttttg 480
tatttacgaa acttcttttg atgattgctt gagcaacctt gatcgagttt tgcagagatg 540
tgaagaaact aatcttgtct tgaactggga aaagtcccac tttatggtta atgaaggcat 600
tgcttgggac ataaaatttc tgaaagaggt accgaagttg acaaagctaa agttgatgct 660
gttgaaaaga tgccatgtcc caaggacatc aaaggtataa gaagtttcct tggtcatgcc 720
gggttttata ggaggtttat caaggacttc accaaggtt 759



<210> 79 

<211> 254 

<212> PRT 

<213> Secale cereale 






<400> 79 

Val Arg Lys Glu Val Phe Lys Phe Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Asp Ser Gln Trp Val Ser Pro Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Ser Gln
35 40 45

Arg Ile Val Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp Gln Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Ser Phe Ser Gln Ile Pro Met Ser Lys Gly Asp Lys Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Ser Ala Thr Phe Gln Thr Cys Met Met Ala Ile
130 135 140

Leu Tyr Asp Phe Cys Glu Arg Ile Val Asp Val Phe Met Asp Asp Phe
145 150 155 160

Cys Ile Tyr Glu Thr Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Ser His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Thr Glu Val Asp Lys Ala Lys Val Asp Ala Val Glu Lys
210 215 220

Met Pro Cys Pro Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250






<210> 80 
<211> 761 
<212> DNA 

<213> Triticum aestivum 
<400> 80 

gtgcgtaagg aggttctcaa gtttctggag gtaggtataa tttatcccgt tgctgatagt 60
cagtgggtaa gtcctgtcca ttgtgtccct aagaagggag gtattactgt tgtccctaat 120
gataaagatg aattgattcc tcaaagaatt attacggtta taggatggta attgatttcc 180
gcaaattaaa taaagccact aagagagatc attacccctt accttttatt gatcaaattc 240
tagaaagatt atgcaaacat acacattatt gcttccaaga tggttatcct ggtttttctc 300
aaatacctgt gtcggctaaa gatcaatcaa agactacttt tacatgccct tttggtactt 360
ttgcttatag atgtatgcct tttggtttat gtaatgcacc tgctaccttt caaagatgca 420
tgatggctat attctctgat ttttgtgaaa agatttgtga ggttttcatg gatgactttt 480
ccgtctatgg ttcctctttt gatgattgct tgagcaatct tgatcgagtt ttgcagagat 540
gtgaagaaac taatcttgtc ttgaattggg aaaagtgtca ctttatggtt aatgaaggta 600
ttgtcttggg gcacaaagtt tctgaaagag gtattgaagt tgataaagcc aaggttgaca 660
ctattgaaaa gataccatgt cccaaggaca tcaaaggtac aagaagtttc cttggtcacg 720
ccggatttta taggaggttc ataaaagatt tcacaaaggt t 761



<210> 81 
<211> 254 
<212> PRT 

<213> Triticum aestivum 
<400> 81 

Val Arg Lys Glu Val Leu Lys Phe Leu Glu Val Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Asp Ser Gln Trp Val Ser Pro Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Arg Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Arg Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Ile
65 70 75 80

Leu Glu Arg Leu Cys Lys His Thr His Tyr Cys Phe Gln Asp Gly Tyr
85 90 95

Pro Gly Phe Ser Gln Ile Pro Val Ser Ala Lys Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Cys Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Ile Cys Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Ser Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Thr Ile Glu Lys
210 215 220

Ile Pro Cys Pro Lys Asp Ile Lys Gly Thr Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 82 
<211> 780 
<212> DNA 

<213> Triticum aestivum
<400> 82 

gtgcggaagg aggtgtttaa gctccttgag gcaggtataa tttatcccgt tgctgatagt 60 
aagtgggtaa ttcctgtcca ttaagtgatc gtgattactg ttgttcctaa gaagggaggt 120 
attaccgttg ttcctaatga taaagatgaa ttgattcctc aaagaaccat tactggttat 180 
aggatggtaa ttgatttccg caaattaaat aaggctacta aaaaatatca ttacccctta 240
ccttttatcg atcaaatgct agaaagatta tccaaacata cacatttttg ctttctagat 300 
ggttactctg gtttctctca aatacctgtg tcagccaaag atcaatcaaa gactactttt 360 
acatgccctt ttggtacttt tgcttataga cgtatgcctt ttggtttatg taatgcacct 420 
gctacctttc aaagatacat gatggctata ttatctgact tttgtgaaaa gatttgtgag 480 
gttttcatgg acgactcttc catctatgga tcttcttttg atgattgctt gagcaacctt 540 
gatcgagttt tgcagagatg tgaagaaact tatcttgtct tgaattggga aaagtgccaa 600 
tttatggtta atgaaggtat tgtcctgggg cataaagttt ctgaaagagg tattcgagtt 660 
gataaagcca aggttgatgc tattgaaaag atgccatgtc ccatggacat caaaggtata 720
agaagtttcc ttggtcatgc cggtttttat aggaggttca taaaagactt cacgaaggtt 780



<210> 83 
<211> 260 
<212> PRT 

<213> Triticum aestivum 



<400> 83 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Asp Ser Lys Trp Val Ile Pro Val His Glx Val Ile Val Ile
20 25 30

Thr Val Val Pro Lys Lys Gly Gly Ile Thr Val Val Pro Asn Asp Lys
35 40 45

Asp Glu Leu Ile Pro Gln Arg Thr Ile Thr Gly Tyr Arg Met Val Ile
50 55 60

Asp Phe Arg Lys Leu Asn Lys Ala Thr Lys Lys Tyr His Tyr Pro Leu
65 70 75 80

Pro Phe Ile Asp Gln Met Leu Glu Arg Leu Ser Lys His Thr His Phe
85 90 95

Cys Phe Leu Asp Gly Tyr Ser Gly Phe Ser Gln Ile Pro Val Ser Ala
100 105 110

Lys Asp Gln Ser Lys Thr Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala
115 120 125

Tyr Arg Arg Met Pro Phe Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln
130 135 140

Arg Tyr Met Met Ala Ile Leu Ser Asp Phe Cys Glu Lys Ile Cys Glu
145 150 155 160

Val Phe Met Asp Asp Ser Ser Ile Tyr Gly Ser Ser Phe Asp Asp Cys
165 170 175

Leu Ser Asn Leu Asp Arg Val Leu Gln Arg Cys Glu Glu Thr Tyr Leu
180 185 190

Val Leu Asn Trp Glu Lys Cys Gln Phe Met Val Asn Glu Gly Ile Val
195 200 205

Leu Gly His Lys Val Ser Glu Arg Gly Ile Arg Val Asp Lys Ala Lys
210 215 220

Val Asp Ala Ile Glu Lys Met Pro Cys Pro Met Asp Ile Lys Gly Ile
225 230 235 240

Arg Ser Phe Leu Gly His Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp
245 250 255

Phe Thr Lys Val
260



<210> 84 
<211> 762 
<212> DNA 

<213> Triticum aestivum
<400> 84 

gtgcgtaagg aggtattcaa gcttctggag gcaggtataa tttatcccgt tgttgatagt 60 
caatgggtaa gtcctgtcca ttgtgtcctt aagaagggag gtattactgt tgtccctaat 120
gataaagatg aattgattcc gcaaagaatt atcacaggtt ataggatggt aattgatttc 180 
cgtaagttaa ataaagctac taagaaagat cattacccct taccttttat tgatcaaatg 240
ttagaaagat tatgcaaaca tacacattat tgctttctag atggttattc tggtttctct 300 
caaatacctg tgtcagctaa ggatcaatca aagactactt ttacatgccc ttttggtact 360 
tttggttata gacgtatgcc tttcgattta tgtaatgcac ctgctacctt tcaaatatgc 420
atgatggcta tattctctga cttttgcgaa aagatttgtg aggttttcat ggacgacttt 480 
tccgtctatg gttcctctta tgatgattgc ttgagcaatc ttaatcgagt tttgcagaga 540
tgtgaagaaa ctaatcttgt cttgaattgg gaaaagtgcc actttatggt taatgaaggt 600 
attgtcttgg ggcacaaagt ttctgaacga ggtattgaag ttgataaggc caaggttgat 660 
gctattgaaa agatgacatg tcccaaggac atcaaaggta taagaagttt ccttggtcac 720 
gccagatttt ataggaggtt cataaaagac ttcacaaagg tt 762 



<210> 85 

<211> 254 

<212> PRT 

<213> Triticum aestivum 



<400> 85 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Val Asp Ser Gln Trp Val Ser Pro Val His Cys Val Leu Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Arg Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Cys Lys His Thr His Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Ser Gln Ile Pro Val Ser Ala Lys Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Gly Tyr Arg Arg Met Pro Phe
115 120 125

Asp Leu Cys Asn Ala Pro Ala Thr Phe Gln Ile Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Ile Cys Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Ser Ser Tyr Asp Asp Cys Leu Ser Asn Leu Asn Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Met Thr Cys Pro Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Arg Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 86 
<211> 762 
<212> DNA 

<213> Triticum aestivum 
<400> 86 

gtgcggaaag aggtgctcaa gcttctggag gcaggtataa tttatcccgt tgctgagagt 60
cagtgggtaa gtcctgtcca ttgtgtccct aagaagggag gtattactgt tgtccctaat 120
gataaagatg aattgattcc tcaaagaatt attacaggtt ataggatggt aattgatttc 180
cgcaaattaa ataaagccac caagaaagat cattacccct taccttttat tgatcaaatg 240
ctagaaagat tatgcaaaca tacacattat tgcttcctag atggttattc tggtttctct 300
caaatacctg tgtcggctaa agatcaatca aagactactt ttacatgccc ttttggtact 360
tttgcttata gacgtatgcc ttttggttta tgtaatgcac cttctacctt tcaaagatgc 420
atgatggcta tattctctga tttttgtgaa aagatttgtg aggttttcat ggacgaattt 480
tccgtctatg gttcctcttt tgatgattgc ttgagcaatc ctgatcgagt tttgcagaga 540
tgtgaagaaa ctaatcttgt cttgaattgg gaaaagtgcc actttatggt taatgaaggt 600
attgtcttgg ggcacaaagt ttctgaaaga ggtattgaag ttgataaagc caaggttgac 660
gctattgaaa agatgccatg tcccaaggac atcaaaggta taagaagttt ccttggtcac 720
gccggatttt ataggaggtt cataaaagac ttcacaaagg tt 762



<210> 87 
<211> 254 
<212> PRT 

<213> Triticum aestivum 



<400> 87 
Val Arg Lys Glu Val Leu Lys Leu Leu Glu Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Ala Glu Ser Gln Trp Val Ser Pro Val His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Pro Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Arg Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Tyr Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Cys Lys His Thr His Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Ser Gln Ile Pro Val Ser Ala Lys Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ser Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Phe Cys Glu Lys Ile Cys Glu Val Phe Met Asp Glu Phe
145 150 155 160

Ser Val Tyr Gly Ser Ser Phe Asp Asp Cys Leu Ser Asn Pro Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Met Pro Cys Pro Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 88 
<211> 762 
<212> DNA 

<213> Triticum aestivum 
<400> 88 

gtgcgtaagg aggttttcaa gttccttgag gcaggtatta cttatcccgt tgctgatagt 60 
gaatgggtaa gccctctcca ttgtgttcct aaaaagggag gtattaccgt tgttcttaat 120 
gataaagatg aattgatccc gcaaataatt attacaggtt ataggatggt aattgatttc 180 
cataagttaa ataaagctac taagaaagat cattaccctt tacctcttat tgatcaaatt 240
ctagaaagac tatccaaaca cacacatttc tgctttctag atggttatac tggtttctct 300 
caaatacctg tgtcagtgaa ggatcaatct aaaactactt ttacttgccc ttttggtact 360 
tttgcttata gacttatgcc ttttggttta tgtaatgcac ctacttcctt tcaaagatgc 420 
atgatggcta tattctctgt tttttgtgaa aatatttgtg aggtattcat ggatgatttc 480 
tccgtttatg gatcctcttt tgatgattgt ttgagcaacc ttgatcgagt tttgcagaga 540
tgcgaagaca ctagtctcat cctgaattgg gaaaagtgtc actttatggt taatgaaggc 600 
attgtcttgg ggcataagat ttccgagaga ggtattgaag ttgacaaagc caaagttgat 660 
gctattgaaa agattccatg tcccaaggac ataaaaggta taagaagttt ccttggtcat 720
gctggttttt ataggaggtt catcaaagac ttctcaaagg tt 762 

<210> 89 
<211> 254 
<212> PRT 

<213> Triticum aestivum 
<400> 89 



Val Arg Lys Glu Val Phe Lys Phe Leu Glu Ala Gly Ile Thr Tyr Pro
1 5 10 15

Val Ala Asp Ser Glu Trp Val Ser Pro Leu His Cys Val Pro Lys Lys
20 25 30

Gly Gly Ile Thr Val Val Leu Asn Asp Lys Asp Glu Leu Ile Pro Gln
35 40 45

Ile Ile Ile Thr Gly Tyr Arg Met Val Ile Asp Phe His Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Tyr Pro Leu Pro Leu Ile Asp Gln Ile
65 70 75 80

Leu Glu Arg Leu Ser Lys His Thr His Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Thr Gly Phe Ser Gln Ile Pro Val Ser Val Lys Asp Gln Ser Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Leu Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Thr Ser Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Val Phe Cys Glu Asn Ile Cys Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Tyr Gly Ser Ser Phe Asp Asp Cys Leu Ser Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Asp Thr Ser Leu Ile Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Asn Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ala Ile Glu Lys
210 215 220

Ile Pro Cys Pro Lys Asp Ile Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Ser Lys Val
245 250






<210> 90 
<211> 791 
<212> DNA 

<213> Gossypium hirsutum 
<400> 90 

gtgcgcaagg aggttttaaa gctacttgat gacgggatga tctatcccat atctaacagt 60
aattgggtta gcccagtaca catagtacca aaaaagacca gtgcaaccgt aatcgagaat 120
tcggcaggtg agatagttcc cactcgggtc caaaacgggt ggagagtatg catcgattac 180
aggaagttga attccttaac tcggaaggat cactttccac ttccttttat tgaccagatg 240
ttagaacgtt tagctggaaa gtctcattat ttagaacgtt tagctggaaa gtctcattat 300
tgttgtttgg atggttacta aggttttttc cagatcccag tggcaccgga ggatcaagaa 360
agacaatgtt tacgtgccca tttggcacgt tttcttacag acggatgccg ttcggactct 420
gtaatgcacc agccagtttt cataggtgca tggtaagtat attttcagac tacgtcgata 480
aaattatcga ggtgttcatg gacgacttta ctgtatatgg tgagtccttc gaggtaagtc 540
tgacgaacct tgcaaaaatt ttggaaagat gcttagaatt taatcttgtt ctaaattatg 600
agaaatgcca ttttatggta gacaagggat tagttctagg tcatattatt tctgctgatg 660
gaatttctgt tgataaagca aaaatcaaca tcattaactc actaccatac cccacaactg 720
tgagggagat ttggtctttc cttggtcatg caggtttcta caagtggttc atcaaagact 780
tttcaaaagt t 791



<210> 91 
<211> 264 
<212> PRT 

<213> Gossypium hirsutum 



<400> 91 
Val Arg Lys Glu Val Leu Lys Leu Leu Asp Asp Gly Met Ile Tyr Pro
1 5 10 15

Ile Ser Asn Ser Asn Trp Val Ser Pro Val His Ile Val Pro Lys Lys
20 25 30

Thr Ser Ala Thr Val Ile Glu Asn Ser Ala Gly Glu Ile Val Pro Thr
35 40 45

Arg Val Gln Asn Gly Trp Arg Val Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Ser Leu Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Lys Ser His Tyr Leu Glu Arg Leu Ala Gly
85 90 95

Lys Ser His Tyr Cys Cys Leu Asp Gly Tyr Glx Gly Phe Phe Gln Ile
100 105 110

Pro Val Ala Pro Glu Asp Gln Glu Lys Thr Met Phe Thr Cys Pro Phe
115 120 125

Gly Thr Phe Ser Tyr Arg Arg Met Pro Phe Gly Leu Cys Asn Ala Pro
130 135 140

Ala Ser Phe His Arg Cys Met Val Ser Ile Phe Ser Asp Tyr Val Asp
145 150 155 160

Lys Ile Ile Glu Val Phe Met Asp Asp Phe Thr Val Tyr Gly Glu Ser
165 170 175

Phe Glu Val Ser Leu Thr Asn Leu Ala Lys Ile Leu Glu Arg Cys Leu
180 185 190

Glu Phe Asn Leu Val Leu Asn Tyr Glu Lys Cys His Phe Met Val Asp
195 200 205

Lys Gly Leu Val Leu Gly His Ile Ile Ser Ala Asp Gly Ile Ser Val
210 215 220

Asp Lys Ala Lys Ile Asn Ile Ile Asn Ser Leu Pro Tyr Pro Thr Thr
225 230 235 240

Val Arg Glu Ile Trp Ser Phe Leu Gly His Ala Gly Phe Tyr Lys Trp
245 250 255

Phe Ile Lys Asp Phe Ser Lys Val
260



<210> 92 
<211> 763 
<212> DNA 

<213> Gossypium hirsutum 
<400> 92 

gtgcgtaaag aggtcgtaaa gctacttgat tccgggatga tctatcccat atctgacaat 60
aattgggtta gtccagtcca catagtaccc aaaaagaccg gtgtaaccgt aattgagaat 120
tcagcaggtg agatggttcc cacttaagtc cgaaacggtc ggagagtatg catcgattac 180
aggaagttga attccttaac tcggaaagat cactttccac ttctttttat tgatcagatg 240
ttagaacatt tagccagaaa gtctcattat tgttgtctgg atggttactc aggttttttc 300
cagatcccaa tggcactaaa ggatcaagaa aagatgacat ttacgtgccc atttggcatg 360
ttcgcttata gaaggatgtc gtttcagact ttgcaatgca ccaaccatgt ttcagaggtg 420
catgataagt atattttttg actatgttaa gaaaataatt gaggtgttca tggacgaatt 480
tactgtatat agtgagtcct tcgaggtata tttgtcaaat ctagaaaaat ttttggaaag 540
atgcttagaa tttaatcttg ttctaaatta tgagaattgc tatttaatgg tagacaaggg 600
attagttcta ggtcatatca tttctgctaa gggaatttct gtcgataaag taaaaattaa 660
catcataagc tcaataccat accccacaac tgtgagggag attcgttctt tccttagtca 720
tataggtttc tataggcgat tcatcaagga cttttcaaaa gtt 763



<210> 93 
<211> 254 
<212> PRT 

<213> Gossypium hirsutum 
<400> 93 

Val Arg Lys Glu Val Val Lys Leu Leu Asp Ser Gly Met Ile Tyr Pro
1 5 10 15

Ile Ser Asp Asn Asn Trp Val Ser Pro Val His Ile Val Pro Lys Lys
20 25 30

Thr Gly Val Thr Val Ile Glu Asn Ser Ala Gly Glu Met Val Pro Thr
35 40 45

Glx Val Arg Asn Gly Arg Arg Val Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Ser Leu Thr Arg Lys Asp His Phe Pro Leu Leu Phe Ile Asp Gln Met
65 70 75 80

Leu Glu His Leu Ala Arg Lys Ser His Tyr Cys Cys Leu Asp Gly Tyr
85 90 95

Ser Gly Phe Phe Gln Ile Pro Met Ala Leu Lys Asp Gln Glu Lys Met
100 105 110

Thr Phe Thr Cys Pro Phe Gly Met Phe Ala Tyr Arg Arg Met Ser Phe
115 120 125

Arg Leu Cys Asn Ala Pro Thr Met Phe Gln Arg Cys Met Ile Ser Ile
130 135 140

Phe Phe Asp Tyr Val Lys Lys Ile Ile Glu Val Phe Met Asp Glu Phe
145 150 155 160

Thr Val Tyr Ser Glu Ser Phe Glu Val Tyr Leu Ser Asn Leu Glu Lys
165 170 175

Phe Leu Glu Arg Cys Leu Glu Phe Asn Leu Val Leu Asn Tyr Glu Asn
180 185 190

Cys Tyr Leu Met Val Asp Lys Gly Leu Val Leu Gly His Ile Ile Ser
195 200 205

Ala Lys Gly Ile Ser Val Asp Lys Val Lys Ile Asn Ile Ile Ser Ser
210 215 220

Ile Pro Tyr Pro Thr Thr Val Arg Glu Ile Arg Ser Phe Leu Ser His
225 230 235 240

Ile Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Ser Lys Val
245 250



<210> 94 
<211> 723 
<212> DNA 

<213> Gossypium hirsutum 
<400> 94 

gtgcgtaagg aggttttgaa attgttggat gctggaatga tatactcgat ctttgacagt 60
gattgggtta gctgggttca tgtcgtgcca aagaaaactg gcgtgacagt ggtgaaaaac 120
tcatcaggag agctagtccc tacccgagtc cagaatcgat ggagggtttg catcgattac 180
aggaagttga acgcagctac ccgaaatgac cattttccac ttcccttcat tgatcaaatg 240
ctcgagcgat tagctaataa gacccattat tgttgtctcg atgggtactc aggacttttc 300
caaattccgg tggcacctga ggatcaagac aaaacaactt tcacgtgccc ctttggaacg 360
tttgcgtata gaagaatgtc gtttggactc tgtaatgctc cggccacttt ccagagatgt 420
atggtgagca tattctctga ttatgtcgag aaaatcattg aattcttcat ggatgacttc 480
acggtgtacg gtaactcttt taacgaatgt ctcgataatc ttgctaagat attacagaga 540
tgcctagaat ttaatcttgt tttaaattat gaaaaatgcc acttcatggt tgacaaagga 600
ttaattttgg gtcatatagt ttcttcagaa ggtattgagg tcaataaagc aaaaacgaat 660
attattgact cattacctta ccccagattt tacagacgat tcataaagga cttcacaaaa 720
gtt 723



<210> 95 
<211> 241 
<212> PRT 

<213> Gossypium hirsutum 
<400> 95 

Val Arg Lys Glu Val Leu Lys Leu Leu Asp Ala Gly Met Ile Tyr Ser
1 5 10 15

Ile Phe Asp Ser Asp Trp Val Ser Trp Val His Val Val Pro Lys Lys
20 25 30

Thr Gly Val Thr Val Val Lys Asn Ser Ser Gly Glu Leu Val Pro Thr
35 40 45

Arg Val Gln Asn Arg Trp Arg Val Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Ala Ala Thr Arg Asn Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Asn Lys Thr His Tyr Cys Cys Leu Asp Gly Tyr
85 90 95

Ser Gly Leu Phe Gln Ile Pro Val Ala Pro Glu Asp Gln Asp Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Val Ser Ile
130 135 140

Phe Ser Asp Tyr Val Glu Lys Ile Ile Glu Phe Phe Met Asp Asp Phe
145 150 155 160

Thr Val Tyr Gly Asn Ser Phe Asn Glu Cys Leu Asp Asn Leu Ala Lys
165 170 175

Ile Leu Gln Arg Cys Leu Glu Phe Asn Leu Val Leu Asn Tyr Glu Lys
180 185 190

Cys His Phe Met Val Asp Lys Gly Leu Ile Leu Gly His Ile Val Ser
195 200 205

Ser Glu Gly Ile Glu Val Asn Lys Ala Lys Thr Asn Ile Ile Asp Ser
210 215 220

Leu Pro Tyr Pro Arg Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys
225 230 235 240

Val



<210> 96 

<211> 762 

<212> DNA 

<213> Lycopersicon esculentum



<400> 96 

gtgcggaaag aggttgtgaa gctgttagat acgggtattg tctagccaat ttcggacaac 60
aagtaggtta gtccagtaca atgtgaacct aaaaagggag acataacggt gatcactaat 120
gaaaaaaatg agttgatccc aaccatgata gtcacataat ggagaatatg catggattac 180
aggaaattga atgaagccac caggaaggac cattacccgg tcccttttat tgatcagatg 240
ttggaccggt tggctgggga ataatattat tgttttctta atggctattt acggtacaac 300
caaattgtga tttcaccaaa ggattaagag aaaaccactt tcacttgccc gtatggtaca 360
tatgctttca aaaagatacc ttttgggtta tgaaatgcct cggctacttt ccaatgatgc 420
atgatggcta tttttcatga tatggttgaa gattttgttg agatattcat gaatgatttc 480 
tcagtgtttg gggattcttt tgatatgtgc ttggagaatt tggacagtgt gttggctagt 540 
tgtgaagaaa ctaatctttt cctaaactgg gaataatagc aatttctagt aaaggaaggg 600 
attatgctag gacataaggt gtcaaagaga ggtatggaag ttgatagtgc caaagtggag 660 
gttattgaaa agcttccccc tcctatatct gttaaaggga tgcaaagttt tctgggtcat 720 
gttgggttct ataggagatt cataaaagac ttcacaaagg tt 762 

<210> 97 
<211> 254 
<212> PRT 

<213> Lycopersicon esculentum 
<400> 97 

Val Arg Lys Glu Val Val Lys Leu Leu Asp Thr Gly Ile Val Glx Pro
1 5 10 15

Ile Ser Asp Asn Lys Glx Val Ser Pro Val Gln Cys Glu Pro Lys Lys
20 25 30

Gly Asp Ile Thr Val Ile Thr Asn Glu Lys Asn Glu Leu Ile Pro Thr
35 40 45

Met Ile Val Thr Glx Trp Arg Ile Cys Met Asp Tyr Arg Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Tyr Pro Val Pro Phe Ile Asp Gln Met
65 70 75 80

Leu Asp Arg Leu Ala Gly Glu Glx Tyr Tyr Cys Phe Leu Asn Gly Tyr
85 90 95

Leu Arg Tyr Asn Gln Ile Val Ile Ser Pro Lys Asp Glx Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Phe Lys Lys Ile Pro Phe
115 120 125

Gly Leu Glx Asn Ala Ser Ala Thr Phe Gln Glx Cys Met Met Ala Ile
130 135 140

Phe His Asp Met Val Glu Asp Phe Val Glu Ile Phe Met Asn Asp Phe
145 150 155 160

Ser Val Phe Gly Asp Ser Phe Asp Met Cys Leu Glu Asn Leu Asp Ser
165 170 175

Val Leu Ala Ser Cys Glu Glu Thr Asn Leu Phe Leu Asn Trp Glu Glx
180 185 190

Glx Gln Phe Leu Val Lys Glu Gly Ile Met Leu Gly His Lys Val Ser
195 200 205

Lys Arg Gly Met Glu Val Asp Ser Ala Lys Val Glu Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Ile Ser Val Lys Gly Met Gln Ser Phe Leu Gly His
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 98 
<211> 689 
<212> DNA 

<213> Lycopersicon esculentum 
<400> 98 

cgaaaggagg tggtgaaact ggaaattatc aagtagttgg atgctagagt aatctatcca 60
atcgccgata gtagttgggt atgcctagtt cagtgtgtac caaagaaagg gggaatgact 120
gtggtcccca acgaaaagaa tgaacttgtt cgaatgagac cggttactgg atggagggtg 180
tgcatggatt accgtaaact gaactcatag actgaaaaag actattttca tatgcccttc 240
atggatcaga tgttggatag acttgccgga aaagggtggt attgttttct tgatgggtat 300
tcggggtata atcagatttc tattgcacca gaagatcaag agaaaaccac tttcacttgt 360
ccatacggga cttttgcatt cagaagaatg tcgtttgggt tgtgcaatgc acccgcaacc 420
tttcagagat ggatgatgtc aatattttct gacatgatgg aggatactat agaggttttt 480
atggatgatt tttctgtggt tggtgattca ttcgagcggt gcttgtccaa tttatctgag 540
gttcttaaga gatgtgaaga ctgcaatttg gtactaaact gggaaaagtg tcatttcatg 600
gtgaaagagg gtattgtgtt gggtcatcgc atttcagaaa agggcatgca tgtttttact 660
ggtgattcat caaagacttc acaaaggtt 689



<210> 99 
<211> 229 
<212> PRT 

<213> Lycopersicon esculentum 
<400> 99 

Arg Lys Glu Val Val Lys Leu Glu Ile Ile Lys Glx Leu Asp Ala Arg
1 5 10 15

Val Ile Tyr Pro Ile Ala Asp Ser Ser Trp Val Cys Leu Val Gln Cys
20 25 30

Val Pro Lys Lys Gly Gly Met Thr Val Val Pro Asn Glu Lys Asn Glu
35 40 45

Leu Val Arg Met Arg Pro Val Thr Gly Trp Arg Val Cys Met Asp Tyr
50 55 60

Arg Lys Leu Asn Ser Glx Thr Glu Lys Asp Tyr Phe His Met Pro Phe
65 70 75 80

Met Asp Gln Met Leu Asp Arg Leu Ala Gly Lys Gly Trp Tyr Cys Phe
85 90 95

Leu Asp Gly Tyr Ser Gly Tyr Asn Gln Ile Ser Ile Ala Pro Glu Asp
100 105 110

Gln Glu Lys Thr Thr Phe Thr Cys Pro Tyr Gly Thr Phe Ala Phe Arg
115 120 125

Arg Met Ser Phe Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Trp
130 135 140

Met Met Ser Ile Phe Ser Asp Met Met Glu Asp Thr Ile Glu Val Phe
145 150 155 160

Met Asp Asp Phe Ser Val Val Gly Asp Ser Phe Glu Arg Cys Leu Ser
165 170 175

Asn Leu Ser Glu Val Leu Lys Arg Cys Glu Asp Cys Asn Leu Val Leu
180 185 190

Asn Trp Glu Lys Cys His Phe Met Val Lys Glu Gly Ile Val Leu Gly
195 200 205

His Arg Ile Ser Glu Lys Gly Met His Val Phe Thr Gly Asp Ser Ser
210 215 220

Lys Thr Ser Gln Arg
225



<210> 100 
<211> 760 
<212> DNA 






<213> Lycopersicon esculentum 



<400> 100 

gtgcgtaagg aggtgtttaa gcttctagat gcgggtattg tctacccaat taggacaaca 60
agtgggttag tctagtacaa tgtgtaccta aaaagggagg catggcaatg attactaatg 120
aaaacaatga gtttatccca accagcacag tcacaagatg gcgaatatgc atgaattaca 180
cgaagttaat gaagccacta ggaagaatca ttacccaatt ctttttattg attatatgtt 240
ggaccggtta gctgggcaag aatattattg ttttttggat tactaatcag ggtacaacta 300
aattttgatt gcaccagagg atcaagagaa aacaactttc acttgcccgt atggtacata 360
tgctttcaag aggatacctt ttgggttatg caatgctctg tctaatttcc aaagatgcat 420
gatgactatt tttcatgata tggttgaata ttttgaggat atattcatgg atgatttctt 480
agtgttttgg gagtcttttg atagatgctt ggagaatttg aacaggttgt tagctaggtg 540
cgaacaaact aatcttgtcc tgaactggga aaaatgtcat tttttagtaa aggaagggaa 600
tttttcgggg cataaggtgt aaaagatagg gctggaagtt gatcatgaca aagtggaagt 660
aattgaaaag atctcctctc ccatttttgt gaaacgggtg agaagtttac taggtcatgc 720
tgagttttac aggatattca tcaaggactt ctcaaaggtt 760



<210> 101 
<211> 254 
<212> PRT 

<213> Lycopersicon esculentum 



<400> 101 
Val Arg Lys Glu Val Phe Lys Leu Leu Asp Ala Gly Ile Val Tyr Pro
1 5 10 15

Ile Ser Asp Asn Lys Trp Val Ser Leu Val Gln Cys Val Pro Lys Lys
20 25 30

Gly Gly Met Ala Met Ile Thr Asn Glu Asn Asn Glu Phe Ile Pro Thr
35 40 45

Ser Thr Val Thr Arg Trp Arg Ile Cys Met Asn Tyr Thr Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asn His Tyr Pro Ile Leu Phe Ile Asp Tyr Met
65 70 75 80

Leu Asp Arg Leu Ala Gly Gln Glu Tyr Tyr Cys Phe Leu Asp Tyr Glx
85 90 95

Ser Gly Tyr Asn Glx Ile Leu Ile Ala Pro Glu Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Phe Lys Arg Ile Pro Phe
115 120 125

Gly Leu Cys Asn Ala Leu Ser Asn Phe Gln Arg Cys Met Met Thr Ile
130 135 140

Phe His Asp Met Val Glu Tyr Phe Glu Asp Ile Phe Met Asp Asp Phe
145 150 155 160

Leu Val Phe Trp Glu Ser Phe Asp Arg Cys Leu Glu Asn Leu Asn Arg
165 170 175

Leu Leu Ala Arg Cys Glu Gln Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Leu Val Lys Glu Gly Asn Phe Ser Gly His Lys Val Glx
195 200 205

Lys Ile Gly Leu Glu Val Asp His Asp Lys Val Glu Val Ile Glu Lys
210 215 220

Ile Ser Ser Pro Ile Phe Val Lys Arg Val Arg Ser Leu Leu Gly His
225 230 235 240

Ala Glu Phe Tyr Arg Ile Phe Ile Lys Asp Phe Ser Lys Val
245 250



<210> 102 
<211> 776 
<212> DNA 

<213> Lycopersicon esculentum 
<400> 102 

gtgcggaaag aagtgtttaa actggaatca ttaaatggtt ggatgctgga gtaatatatc 60 
cgatctccga tagtagttgg gtatgcccta ttcagtgtgt acctaagaaa gggggaatga 120 
ctgtggtccc caataagaaa aatgaacttg ttctaatgag accggttact ggagggtggg 180 
tgtgtatgga ttaccgtaaa ttaaatgcat ggactgaaaa agaccatttt cctatgccct 240
tcatggatca gatgttggat agacttgccg aaaaagggtg gtactgtttt cttgatggat 300
agtcagggta taattagatt tctattgcac cagaagatca agagaaaacc acatttactt 360 
gtccatatgg gacctttgca ttgaagagaa tgtcgtttgg gttgtgcaat gcacccgcca 420
catttcacag atgtaaaaat gttgatattc ttcgacatgg tggatgatac tattgatgct 480 
tttatggatg atttttctct tgttggtgaa tcattcgaga ggtgtttgaa ccatttatct 540
gatgtcctta agagatgtga agactgcaat ttagtactaa attgggaaaa atgccacttc 600 
atggtgaaaa aaggtattgt tttgggtcat cgcattccag aaaagggcat agaggttgat 660 
cgagctaaag tagaggtaat agagagactt cccccactat ctctgtaaaa ggtgtgagaa 720 
gctttcttgg gcatgcaagt ttttaccgga gattcatcaa agacttcaca aaagtt 776 

<210> 103 
<211> 258
<212> PRT

<213> Lycopersicon esculentum 



<400> 103 

Ala Glu Arg Ser Val Glx Thr Gly Ile Ile Lys Trp Leu Asp Ala Gly
1 5 10 15

Val Ile Tyr Pro Ile Ser Asp Ser Ser Trp Val Cys Pro Ile Gln Cys
20 25 30

Val Pro Lys Lys Gly Gly Met Thr Val Val Pro Asn Lys Lys Asn Glu
35 40 45

Leu Val Leu Met Arg Pro Val Thr Gly Gly Trp Val Cys Met Asp Tyr
50 55 60

Arg Lys Leu Asn Ala Trp Thr Glu Lys Asp His Phe Pro Met Pro Phe
65 70 75 80

Met Asp Gln Met Leu Asp Arg Leu Ala Glu Lys Gly Trp Tyr Cys Phe
85 90 95

Leu Asp Gly Glx Ser Gly Tyr Asn Glx Ile Ser Ile Ala Pro Glu Asp
100 105 110

Gln Glu Lys Thr Thr Phe Thr Cys Pro Tyr Gly Thr Phe Ala Leu Lys
115 120 125

Arg Met Ser Phe Gly Leu Cys Asn Ala Pro Ala Thr Phe His Arg Cys
130 135 140

Lys Met Leu Ile Phe Phe Asp Met Val Asp Asp Thr Ile Asp Ala Phe
145 150 155 160

Met Asp Asp Phe Ser Leu Val Gly Glu Ser Phe Glu Arg Cys Leu Asn
165 170 175

His Leu Ser Asp Val Leu Lys Arg Cys Glu Asp Cys Asn Leu Val Leu
180 185 190

Asn Trp Glu Lys Cys His Phe Met Val Lys Lys Gly Ile Val Leu Gly
195 200 205

His Arg Ile Pro Glu Lys Gly Ile Glu Val Asp Arg Ala Lys Val Glu
210 215 220

Val Ile Glu Arg Leu Pro Pro Pro Ile Ser Val Lys Gly Val Arg Ser
225 230 235 240

Phe Leu Gly His Ala Ser Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr
245 250 255

Lys Val



<210> 104 
<211> 761 
<212> DNA 

<213> Solanum tuberosum
<400> 104 

gtgcggaagg aggtacttaa attgttggat gcacggattg tgtacccaat atcagacagt 60
aaatgggtaa gtccagtaaa gtgtgtgccc aagaagggca gaatgacggt gttgactaat 120
gagaagaatg aggtaatccc cacaagaaca gtgactgggt gacggatttg catggactac 180
atgaagttga acgacgccac cagaaaggac cattatccgg tacctttcat tgataaaata 240
ttggataggt tggcaggaca tgagtactat tgttttcttg gtgtctactc agggtacaat 300
cagattgtta ttgcaataga ggactaggtg aaaaccacct tcacctgttc gtatggcaca 360
tatgcgttca agcacatgcc attcggcttg tgcaatgccc tggccacatt tcagagatgc 420
atgttggcaa tcttccatga tatggtggag gattttgttg aagttttcat ggatgacttc 480
ttggtgtttg gtgagtcttt tgaactttgt ttgactaatt ttgacagatt tcttgctagg 540
tgtgaagaga cgaatctggt gataaactga tagaagtgtc actttctggt tcgagaggga 600
attgtgttgg gacacaagat ctccaaaaat gggctgaaag ttgacaaagc caacgtagag 660
gttattgaga aattgccacc cccatcacag tgaaggtaat taaaagctta ctaggacatg 720
cttggtttta tacgaggttc atcaaagact tcacaaaggt t 761

<210> 105 
<211> 254 
<212> PRT 

<213> Solanum tuberosum
<400> 105 

Val Arg Lys Glu Val Leu Lys Leu Leu Asp Ala Arg Ile Val Tyr Pro
1 5 10 15

Ile Ser Asp Ser Lys Trp Val Ser Pro Val Lys Cys Val Pro Lys Lys
20 25 30

Gly Arg Met Thr Val Leu Thr Asn Glu Lys Asn Glu Val Ile Pro Thr
35 40 45

Arg Thr Val Thr Gly Glx Arg Ile Cys Met Asp Tyr Met Lys Leu Asn
50 55 60

Asp Ala Thr Arg Lys Asp His Tyr Pro Val Pro Phe Ile Asp Lys Ile
65 70 75 80

Leu Asp Arg Leu Ala Gly His Glu Tyr Tyr Cys Phe Leu Gly Val Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Val Ile Ala Ile Glu Asp Glx Val Lys Thr
100 105 110

Thr Phe Thr Cys Ser Tyr Gly Thr Tyr Ala Phe Lys His Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Leu Ala Thr Phe Gln Arg Cys Met Leu Ala Ile
130 135 140

Phe His Asp Met Val Glu Asp Phe Val Glu Val Phe Met Asp Asp Phe
145 150 155 160

Leu Val Phe Gly Glu Ser Phe Glu Leu Cys Leu Thr Asn Phe Asp Arg
165 170 175

Phe Leu Ala Arg Cys Glu Glu Thr Asn Leu Val Ile Asn Glx Glx Lys
180 185 190

Cys His Phe Leu Val Arg Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Lys Asn Gly Leu Lys Val Asp Lys Ala Asn Val Glu Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Ile Thr Val Lys Val Ile Lys Ser Leu Leu Gly His
225 230 235 240

Ala Trp Phe Tyr Thr Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 106 
<211> 760 
<212> DNA 

<213> Solanum tuberosum 
<400> 106 

gtgcgtaaag aggttttcaa actgctagat gtcggtattg tatatccgat ttcagaaagc 60 
aaatgggtca gcccagttta gtgtgtgcct aaaaaaagag gcatgccggt gatcaccaat 120 
gaaaaaaatg agttgattcc aaccaggaca gtgacagggt ggcgaatatg catggattat 180 
aggaaattga atgaggccac cagaaaggat cactgcccgg ttccttttat tgatcagatg 240 
ctggacaggt tagttgggca agaatattat tgtttcctgg aaggctattc aggatacaac 300 
caaattgtga ttgcaccaga ggaccaggag aaaactacat tcacttgtct gtatgggaca 360 
tatgctttca agtgactgcc gtttgggcta tgcaatgctc cagccacctt ccaaagatga 420 
atgatggcta tctttcatga tatggttgaa gattttgtgg agatattcat ggatgacttc 480 
tcagtcttta gggagtcttt tgataggtgt ttggagaatt gggacagggt gctggctaga 540 
tgcgaggaaa ctaatctcat cctaaactgg aaaaaatgtc atttcctagt aaatgaaggg 600 
attgtattgg gccataaggt gtcaaagaga gggctggaag ttgatcgtgc caaagtggaa 660 
gttattgaaa aactacctcc tccaatctgt taaaggggtg agaagctttc tgggtcatgc 720 
tggtttttac aggagattta taaaggactt cacaaaggtt 760 



<210> 107 
<211> 254 
<212> PRT 

<213> Solanum tuberosum 
<400> 107 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Val Gly Ile Val Tyr Pro 
1 5 10 15 

Ile Ser Glu Ser Lys Trp Val Ser Pro Val Glx Cys Val Pro Lys Lys 
20 25 30 

Arg Gly Met Pro Val Ile Thr Asn Glu Lys Asn Glu Leu Ile Pro Thr 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Ile Cys Met Asp Tyr Arg Lys Leu Asn 
50 55 60 

Glu Ala Thr Arg Lys Asp His Cys Pro Val Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Asp Arg Leu Val Gly Gln Glu Tyr Tyr Cys Phe Leu Glu Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Val Ile Ala Pro Glu Asp Gln Glu Lys Thr 
100 105 110 

Thr Phe Thr Cys Leu Tyr Gly Thr Tyr Ala Phe Lys Glx Leu Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Glx Met Met Ala Ile 
130 135 140 

Phe His Asp Met Val Glu Asp Phe Val Glu Ile Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Arg Glu Ser Phe Asp Arg Cys Leu Glu Asn Trp Asp Arg 
165 170 175 

Val Leu Ala Arg Cys Glu Glu Thr Asn Leu Ile Leu Asn Trp Lys Lys 
180 185 190 

Cys His Phe Leu Val Asn Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Lys Arg Gly Leu Glu Val Asp Arg Ala Lys Val Glu Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Ile Ser Val Lys Gly Val Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 108 
<211> 761 
<212> DNA 

<213> Solanum tuberosum 
<400> 108 

gtgcgtaaag aggttttcaa gctctggatg caggtattgt ctatccaatt tcagacagca 60 
agtgggtcag tccagttcag tgtgtgccta aaaagggagg catgacggtg atcactaatg 120 
aaaaaaatga gttgattcca accaggacag tgacaggatg gcgaatatgc atggattaca 180 
gaaaattaaa tgaagctacc agaaaggatc actacccggt tccttttatt gatcagatgc 240 
tggacaggtt ggctggacaa gaatattatt gtttcttgga tggttattca ggatacaacc 300 
aaatagtgat tgcaccagag gaccagggga aaactacatt cacttgcttg tatgggacat 360 
atgtttccaa gagaatgtcg tttgggctat gcaatgctcc atccattttc caaagatgca 420 
tgatggccat cttccatgat aaggttgaag attttatgga aatattcatg gatgacttct 480 
cagtatttgg ggagtctttt gacaggtgct tggagaattt agacagagtg ttggctagat 540 
gcgaggaaac taattttgtc ctaaactggg aaaaatgtca tttcctagtg aaggaaggga 600 
ttgtgttggg tcataaggtg tcaaagagag ggctggaagt tgatcgtgcc agagtggaaa 660 
taatcaaaaa gctacctccc ccaatttctg ttaaaggggt gcgaagtttt ttgggtcatg 720 
ttagtttcta cgaaagattc ataaaggact tcaccaaggt t 761 



<210> 109 
<211> 254 
<212> PRT 

<213> Solanum tuberosum 
<400> 109 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Ala Gly Ile Val Tyr Pro 
1 5 10 15 

Ile Ser Asp Ser Lys Trp Val Ser Pro Val Gln Cys Val Pro Lys Lys 
20 25 30 

Gly Gly Met Thr Val Ile Thr Asn Glu Lys Asn Glu Leu Ile Pro Thr 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Ile Cys Met Asp Tyr Arg Lys Leu Asn 
50 55 60 

Glu Ala Thr Arg Lys Asp His Tyr Pro Val Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Asp Arg Leu Ala Gly Gln Glu Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Val Ile Ala Pro Glu Asp Gln Gly Lys Thr 
100 105 110 

Thr Phe Thr Cys Leu Tyr Gly Thr Tyr Val Ser Lys Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ser Ile Phe Gln Arg Cys Met Met Ala Ile 
130 135 140 

Phe His Asp Lys Val Glu Asp Phe Met Glu Ile Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Glu Ser Phe Asp Arg Cys Leu Glu Asn Leu Asp Arg 
165 170 175 

Val Leu Ala Arg Cys Glu Glu Thr Asn Phe Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Leu Val Lys Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Lys Arg Gly Leu Glu Val Asp Arg Ala Arg Val Glu Ile Ile Lys Lys 
210 215 220 

Leu Pro Pro Pro Ile Ser Val Lys Gly Val Arg Ser Phe Leu Gly His 
225 230 235 240 

Val Ser Phe Tyr Glu Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 110 
<211> 762 
<212> DNA 
<213> Solanum tuberosum 
<400> 110 

gtgcgtaagg aggtcctcaa gctgtctgat gcaggaattg tgtaccccat ttatgatata 60 
aagtggatca gcccagttca ctgtgtgccg aaaaagggag gcatgacgat tattactaat 120 
gaaaagaagg agttgatttc agctagaacg gtgatagagt ggcacatatg aatggactat 180 
aggagactaa atgaggcaac tagaaaggaa cactacccag ttcctttcat tgatcaaatg 240 
ttggacaggt ttattgggca agagtattat tgtttcctag atggctattc aggatataat 300 
caaattgtga ttgcgccata agataaagag aaaactacat ttacttctct atatgggaca 360 
tatgccttca agagaatgtc gtttgggccg tgcaatgctc caaccacatt ccaaagatgc 420 
atgacagcca tttttcatga tatggtcaaa tattttgtgg agatattcat ggatgaattc 480 
ttagtctttg gggagtcttt tgacacgtgt ctagaatatt tggacaatgt gcttgccaga 540 
tgtgaggaaa ctaatcccgt cctcaactgg gaaaaatgtc attttctagt gaagaagggg 600 
attgtactag gccacaaggt ttcagaggaa ggactggaag ttgatcgtgg aaaagtagag 660 
gtaatttaaa agctaccccc tcaagtcttc gttaaagggg tgagaaggtt ccttggtcat 720 
tctaggttcg aaatgagatt cataaaagac ttcacaaaag tt 762 



<210> 111 
<211> 254 
<212> PRT 

<213> Solanum tuberosum 
<400> 111 

Val Arg Lys Glu Val Leu Lys Leu Ser Asp Ala Gly Ile Val Tyr Pro 
1 5 10 15 

Ile Tyr Asp Ile Lys Trp Ile Ser Pro Val His Cys Val Pro Lys Lys 
20 25 30 

Gly Gly Met Thr Ile Ile Thr Asn Glu Lys Lys Glu Leu Ile Ser Ala 
35 40 45 

Arg Thr Val Ile Glu Trp His Ile Glx Met Asp Tyr Arg Arg Leu Asn 
50 55 60 

Glu Ala Thr Arg Lys Glu His Tyr Pro Val Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Asp Arg Phe Ile Gly Gln Glu Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Val Ile Ala Pro Glx Asp Lys Glu Lys Thr 
100 105 110 

Thr Phe Thr Ser Leu Tyr Gly Thr Tyr Ala Phe Lys Arg Met Ser Phe 
115 120 125 

Gly Pro Cys Asn Ala Pro Thr Thr Phe Gln Arg Cys Met Thr Ala Ile 
130 135 140 

Phe His Asp Met Val Lys Tyr Phe Val Glu Ile Phe Met Asp Glu Phe 
145 150 155 160 

Leu Val Phe Gly Glu Ser Phe Asp Thr Cys Leu Glu Tyr Leu Asp Asn 
165 170 175 

Val Leu Ala Arg Cys Glu Glu Thr Asn Pro Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Leu Val Lys Lys Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Glu Glu Gly Leu Glu Val Asp Arg Gly Lys Val Glu Val Ile Glx Lys 
210 215 220 

Leu Pro Pro Gln Val Phe Val Lys Gly Val Arg Arg Phe Leu Gly His 
225 230 235 240 

Ser Arg Phe Glu Met Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 112 
<211> 762 
<212> DNA 

<213> Solanum tuberosum 
<400> 112 

gtgcggaagg aggtttttaa gctgctggat gcgggtattg tataccagat ttcagatagc 60 
aaaggggtct acccgattta gtttgtgcct aaaaaatgca gcatgacagt gatcaccaat 120 
gaaaagaatg agctgattcc aaccaggaca gtgacagggt ggcgaatatg catggattat 180 
atgaagttga atgaggccac cagaaaggat cactacccga ttcattttat tgatcagatg 240 
ttggacaagt tagctgagta aaaatattat tgtttcttgg cttgttattc aagatacaac 300 
caatttctca ttgcaccaca ggaccaggag gaaactacat tcacttgtcc ttatgggaca 360 
tatgctttca agcgaatgtc gtttgggcta tgcaatgctc caaccacctt ccaaagatgc 420 
ataagggcta tctttcatga tatggttgaa gattttgtgg agatattcat ggatgacttc 480 
tcagtctttg ggtagtcttt tgagaggtgt ctggaaaatt ttgacagggt gctggctgta 540 
tgcgaggaaa ctaatttttt cctaaactgg gaaaaatgtc attttctagt gaaggaaggg 600 
attgtattgg gacataaggt gtcaaagtga aggcttgaag ttgatcgtgc caaagtggaa 660 
gtcgttgaaa acctaccttc cccattctct gttaaagggg tgagaagttt tttgggtcat 720 
gctggtttct ataggagatt tatcaaagac ttcactaagg tt 762 

<210> 113 
<211> 254 
<212> PRT 
<213> Solanum tuberosum 



<400> 113 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Ala Gly Ile Val Tyr Gln 
1 5 10 15 

Ile Ser Asp Ser Lys Gly Val Tyr Pro Ile Glx Phe Val Pro Lys Lys 
20 25 30 

Cys Ser Met Thr Val Ile Thr Asn Glu Lys Asn Glu Leu Ile Pro Thr 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Ile Cys Met Asp Tyr Met Lys Leu Asn 
50 55 60 

Glu Ala Thr Arg Lys Asp His Tyr Pro Ile His Phe Ile Asp Gln Met 
65 70 75 80 

Leu Asp Lys Leu Ala Glu Glx Lys Tyr Tyr Cys Phe Leu Ala Cys Tyr 
85 90 95 

Ser Arg Tyr Asn Gln Phe Leu Ile Ala Pro Gln Asp Gln Glu Glu Thr 
100 105 110 

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Phe Lys Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Thr Thr Phe Gln Arg Cys Ile Arg Ala Ile 
130 135 140 

Phe His Asp Met Val Glu Asp Phe Val Glu Ile Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Glx Ser Phe Glu Arg Cys Leu Glu Asn Phe Asp Arg 
165 170 175 

Val Leu Ala Val Cys Glu Glu Thr Asn Phe Phe Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Leu Val Lys Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Lys Glx Arg Leu Glu Val Asp Arg Ala Lys Val Glu Val Val Glu Asn 
210 215 220 

Leu Pro Ser Pro Phe Ser Val Lys Gly Val Arg Ser Phe Leu Gly His 
225 230 235 240 



Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 114 
<211> 793 
<212> DNA 

<213> Solanum tuberosum 
<400> 114 

aacttttgtg aagtctttaa tgaaggatgt tgtcagagaa gaagtcatca agtggctgga 60 
tacagggatt gtgtacccaa tatctgacaa taaatgggca agtccagtgc agtgtgtgcc 120 
taaaaaggga ggaatgacag ttgtgaccaa tgagaaaaat gagttgatcc ccacaagaac 180 
agtaactggg tggaggctat gcatggacta cagaaaactc aatgaagcca ccaggaagga 240 
ccactattcg gtaccgttca ttgatcaaat gttagacagg ttggctggcc aagagtatta 300 
ctgtttcctt gatggttatt caaggtataa ttagatcgtc attgcacctg aggatcaaga 360 
gaatacgaca ttcacttgcc catatggcac gtatgcattc aaacgcttgc cattcggctt 420 
gtgcaatgcc ccaaccctat ttcagagatg tatgatggca atcttccatg atatggtgga 480 
agattttgtt aaagtataca tggacgattt ctcggtgttt ggtgagtcgt tcgaactttg 540 
tttatctaat cgtgatagag ttcttactag gtgtgaggag accaatttgg tgctgaactg 600 
ggagaagtgt cactttctgg tcagagaagg aattatgttg gggcagaaga tctccaaaag 660 
tgggctagaa gtagacaagg cgaaggtgga agtgattgag aagttgccac caccaatata 720 
agtaaaggga gtgcgaagct tccttggaca tgctggtttt tacaagaggt tcataaagga 780 
cttttcaaag gtt 793 



<210> 115 
<211> 264 
<212> PRT 

<213> Solanum tuberosum 
<400> 115 

Thr Phe Val Lys Ser Leu Met Lys Asp Val Val Arg Glu Glu Val Ile 
1 5 10 15 

Lys Trp Leu Asp Thr Gly Ile Val Tyr Pro Ile Ser Asp Asn Lys Trp 
20 25 30 

Ala Ser Pro Val Gln Cys Val Pro Lys Lys Gly Gly Met Thr Val Val 
35 40 45 

Thr Asn Glu Lys Asn Glu Leu Ile Pro Thr Arg Thr Val Thr Gly Trp 
50 55 60 

Arg Leu Cys Met Asp Tyr Arg Lys Leu Asn Glu Ala Thr Arg Lys Asp 
65 70 75 80 

His Tyr Ser Val Pro Phe Ile Asp Gln Met Leu Asp Arg Leu Ala Gly 
85 90 95 

Gln Glu Tyr Tyr Cys Phe Leu Asp Gly Tyr Ser Arg Tyr Asn Glx Ile 
100 105 110 

Val Ile Ala Pro Glu Asp Gln Glu Asn Thr Thr Phe Thr Cys Pro Tyr 
115 120 125 

Gly Thr Tyr Ala Phe Lys Arg Leu Pro Phe Gly Leu Cys Asn Ala Pro 
130 135 140 

Thr Leu Phe Gln Arg Cys Met Met Ala Ile Phe His Asp Met Val Glu 
145 150 155 160 

Asp Phe Val Lys Val Tyr Met Asp Asp Phe Ser Val Phe Gly Glu Ser 
165 170 175 

Phe Glu Leu Cys Leu Ser Asn Arg Asp Arg Val Leu Thr Arg Cys Glu 
180 185 190 

Glu Thr Asn Leu Val Leu Asn Trp Glu Lys Cys His Phe Leu Val Arg 
195 200 205 

Glu Gly Ile Met Leu Gly Gln Lys Ile Ser Lys Ser Gly Leu Glu Val 
210 215 220 

Asp Lys Ala Lys Val Glu Val Ile Glu Lys Leu Pro Pro Pro Ile Glx 
225 230 235 240 

Val Lys Gly Val Arg Ser Phe Leu Gly His Ala Gly Phe Tyr Lys Arg 
245 250 255 

Phe Ile Lys Asp Phe Ser Lys Val 
260 



<210> 116 
<211> 761 
<212> DNA 

<213> Platanus occidentalis 
<400> 116 

gtgcgtaagg aggttttcaa acttcttaaa gtttgagtga tttatcctat ttaggatagg 60 
aattgggtca gcccggttca agtggttcct aaaaagattg gaataaccgt tgtgaaaaat 120 
tagaatgatg agttggttcc taccagtgtt cagaatgggt ggagggttgt atagattata 180 
gaaaattgaa tgttgtaacc cgcaaggatc acttcccttt accttttatt gatcaaatgc 240 
ttgaaaggtt agttggtcat tcttactatt gtttcctaga tggttattca agttatttcc 300 
agattgtaat tactccagag gattaagaaa agacaacttt tacatgtcca tttgggactt 360 
ttgcatatcg ttgcatgccc tttggccttt gcaatgcccc aaccactttc caaaggtgta 420 
tggttagcat attttcatat tacattgaga atatcataga agtttttatg gatgatttca 480 
tagtttatgg agactccttt aataattttc tgcataacct tacacttgtt cttcaaagat 540 
gcatagaaac taaccttgtg ttaaattatg aaaaatgtca ttttatggtt gaacaaggta 600 
tagttttggg tcatgttatt tcatctaaag gaattgaggt agataaagct aaagttgata 660 
ttattcaatc tttaccttat ctcattagta tgcggaaagt tcattctttt cttggacatg 720 
caggtttcta ccgaagattc attaaagact ttacaaaggt t 761 



<210> 117 
<211> 254 
<212> PRT 

<213> Platanus occidentalis 
<400> 117 

Val Arg Lys Glu Val Phe Lys Leu Leu Lys Val Glx Val Ile Tyr Pro 
1 5 10 15 

Ile Glx Asp Arg Asn Trp Val Ser Pro Val Gln Val Val Pro Lys Lys 
20 25 30 

Ile Gly Ile Thr Val Val Lys Asn Glx Asn Asp Glu Leu Val Pro Thr 
35 40 45 

Ser Val Gln Asn Gly Trp Arg Val Cys Ile Asp Tyr Arg Lys Leu Asn 
50 55 60 

Val Val Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Glu Arg Leu Val Gly His Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Ser Tyr Phe Gln Ile Val Ile Thr Pro Glu Asp Glx Glu Lys Thr 
100 105 110 

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Cys Met Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Thr Thr Phe Gln Arg Cys Met Val Ser Ile 
130 135 140 

Phe Ser Tyr Tyr Ile Glu Asn Ile Ile Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Ile Val Tyr Gly Asp Ser Phe Asn Asn Phe Leu His Asn Leu Thr Leu 
165 170 175 

Val Leu Gln Arg Cys Ile Glu Thr Asn Leu Val Leu Asn Tyr Glu Lys 
180 185 190 

Cys His Phe Met Val Glu Gln Gly Ile Val Leu Gly His Val Ile Ser 
195 200 205 

Ser Lys Gly Ile Glu Val Asp Lys Ala Lys Val Asp Ile Ile Gln Ser 
210 215 220 

Leu Pro Tyr Leu Ile Ser Met Arg Lys Val His Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 118 
<211> 762 
<212> DNA 

<213> Platanus occidentalis 
<400> 118 

gtgcgtaagg aagttttcaa gcttcttgaa gttggagtga tttatcttat ttcgaatagc 60 
aattgggtta gcccagttca agtggctcct aaaaagactg gaataaccgt tgtgaaaaat 120 
cagaatgatg agttagttcc tacccatgtt cagaatgggt ggtgggtttg tataaattat 180 
agaaaattaa atgttataac ctgcaaggat cacttccctt taccttttat tgataaaatg 240 
cttgaaaggt tagctggtca ttcttactat tgtttccttg atggttattt aggttatttt 300 
caaattgcaa ttacttcgga ggatcaagaa aagatgattt ttaagtgccc attcgggact 360 
tttgcatatc gtcacatgcc ctttggcctt tgcaatgccc caaccacttt ctaaaggtgt 420 
atggttagca tattttcaga ttacattgag aatatcatag aagtctttat ggatgatttc 480 
acagtttatg gagactcctt tgataattgt ctgcataacc ttacacttgt tattcaaaga 540 
tgcatagaaa ctaacctagt gttaaattct taaaaatgtc attttatggt tgaacaaggt 600 
atagttttgg gtcatgttgt ttcatctagg ggaattgagg tagataaacc taaagttgat 660 
attattcaaa ctttacctta ttccactagt gtgcgagaag ttcgttcttt tcttggacat 720 
gtaggttttt actgaagatt cataaaagac ttcacaaagg tt 762 



<210> 119 
<211> 254 
<212> PRT 

<213> Platanus occidentalis 
<400> 119 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Val Gly Val Ile Tyr Leu 
1 5 10 15 

Ile Ser Asn Ser Asn Trp Val Ser Pro Val Gln Val Ala Pro Lys Lys 
20 25 30 

Thr Gly Ile Thr Val Val Lys Asn Gln Asn Asp Glu Leu Val Pro Thr 
35 40 45 

His Val Gln Asn Gly Trp Trp Val Cys Ile Asn Tyr Arg Lys Leu Asn 
50 55 60 

Val Ile Thr Cys Lys Asp His Phe Pro Leu Pro Phe Ile Asp Lys Met 
65 70 75 80 

Leu Glu Arg Leu Ala Gly His Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Leu Gly Tyr Phe Gln Ile Ala Ile Thr Ser Glu Asp Gln Glu Lys Met 
100 105 110 

Ile Phe Lys Cys Pro Phe Gly Thr Phe Ala Tyr Arg His Met Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Thr Thr Phe Glx Arg Cys Met Val Ser Ile 
130 135 140 

Phe Ser Asp Tyr Ile Glu Asn Ile Ile Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Thr Val Tyr Gly Asp Ser Phe Asp Asn Cys Leu His Asn Leu Thr Leu 
165 170 175 

Val Ile Gln Arg Cys Ile Glu Thr Asn Leu Val Leu Asn Ser Glx Lys 
180 185 190 

Cys His Phe Met Val Glu Gln Gly Ile Val Leu Gly His Val Val Ser 
195 200 205 

Ser Arg Gly Ile Glu Val Asp Lys Pro Lys Val Asp Ile Ile Gln Thr 
210 215 220 

Leu Pro Tyr Ser Thr Ser Val Arg Glu Val Arg Ser Phe Leu Gly His 
225 230 235 240 

Val Gly Phe Tyr Glx Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 120 
<211> 759 
<212> DNA 

<213> Platanus occidentalis 
<400> 120 

gtgcggaaag aggtttttaa gcttttggat gtagggatta tatacccaat tttttatagt 60 
aattaggtaa gtcccactca agtggaccca agaattctgg tgtgactgta gttaaaaatg 120 
caaatgatga attgattcca aatagactca ctattggttg gcgtgtatgc attaactata 180 
agaagttgaa ctcagtgact aggaaggacc atttcccttt accattcatg actaaatcct 240 
agaaagggta gctggtcaca aattttatta tttcctatat ggttattcta gatataacta 300 
aatagagatt gcacctgagg actaagaaaa taccactttt acatgtccat ttggcacttt 360 
tgcttatcga aggatgtcat ttggattatg taatgctctt gccacgttct aaagatgcat 420 
gttgagtata tttagtgata tggtagaaca ttttcttgag gtgtttatgg attttttttg 480 
tttttggtaa ttcatttgat gattgtttgc ataatttgaa aaaagtgtta aatagatgtg 540 
aaggaaaaaa acatcatttt gaattgagag aagtgtcatt tcatggtctc taaaagaatt 600 
gtacttggtc acattgtctc ctcccaagga attaaagtgg tcaaagccaa aattgaattg 660 
atagtcaatt tgcctagccc aaagactctt aaagacattc gatcttttct aggtcatgca 720 
ggatttaaca aaaggttcat caaagacttc acgaaagtt 759 



<210> 121 
<211> 254 
<212> PRT 
<213> Platanus occidentalis 
<400> 121 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Val Gly Ile Ile Tyr Pro 
1 5 10 15 

Ile Phe Tyr Ser Asn Glx Val Ser Pro Thr Gln Val Val Pro Lys Asn 
20 25 30 

Ser Gly Val Thr Val Val Lys Asn Ala Asn Asp Glu Leu Ile Pro Asn 
35 40 45 

Arg Leu Thr Ile Gly Trp Arg Val Cys Ile Asn Tyr Lys Lys Leu Asn 
50 55 60 

Ser Val Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Glx Ile 
65 70 75 80 

Leu Glu Arg Val Ala Gly His Lys Phe Tyr Tyr Phe Leu Tyr Gly Tyr 
85 90 95 

Ser Arg Tyr Asn Glx Ile Glu Ile Ala Pro Glu Asp Glx Glu Asn Thr 
100 105 110 

Thr Phe Thr Cys Pro Phe Gly Thr Phe Ala Tyr Arg Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Leu Ala Thr Phe Glx Arg Cys Met Leu Ser Ile 
130 135 140 

Phe Ser Asp Met Val Glu His Phe Leu Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Phe Val Phe Gly Asn Ser Phe Asp Asp Cys Leu His Asn Leu Lys Lys 
165 170 175 

Val Leu Asn Arg Cys Glu Glu Lys Asn Ile Ile Leu Asn Glx Glu Lys 
180 185 190 

Cys His Phe Met Val Ser Lys Arg Ile Val Leu Gly His Ile Val Ser 
195 200 205 

Ser Gln Gly Ile Lys Val Val Lys Ala Lys Ile Glu Leu Ile Val Asn 
210 215 220 

Leu Pro Ser Pro Lys Thr Leu Lys Asp Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Asn Lys Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 122 
<211> 761 
<212> DNA 

<213> Platanus occidentalis 
<400> 122 

tgcgtaaaga ggtggtcaag cttcttgaag ttggagtgat ttatcctatt tcggatagca 60 
attgggttag cccggttcaa gtggttccta aaaagactgg aataaccgtt gtgaaaaatc 120 
aaaatgatga gttagttcct acccgtgttc agaatgggtg gcaggtttgt atagattata 180 
taaaattaaa tgttgtaacc cgcaaggatc acttcccttt accttttatt gatcaaatgt 240 
ttgaaaggtt agctggtcat tcttactatt gtttccttga tggatattca tgttattttt 300 
agattgcaat tactccagag gatcaagaaa agacgacttt tacgtgccca ttcgggactt 360 
tttcatatcg ttgcatgccc tttggccttt gcaacgcccc agccactttc caaaggtgta 420 
tggttagcat attttcagat tacattgaga atatcataga agtctttatg gatgatttca 480 
tagtttatga agactccttt gataattgtc tgcataacct tacacttgtt ttttaaagat 540 
gcatagaaac taaccttgtg ttaaattttg aaaaatgtca tgttatggtt gaataaggta 600 
tagttttggg tcatgttgtt tcatctatgg gaattgaggt agataaagtt aaagttgata 660 
ttattcaatc tttaccttat cccattagtg tgcaggaagt tcgttctttt cttggacatg 720 
cgggttttta ccaaagattc attaaagact tcacgaaagt t 761 



<210> 123 
<211> 253 
<212> PRT 



<213> Platanus occidentalis 



<400> 123 

Arg Lys Glu Val Val Lys Leu Leu Glu Val Gly Val Ile Tyr Pro Ile 
1 5 10 15 

Ser Asp Ser Asn Trp Val Ser Pro Val Gln Val Val Pro Lys Lys Thr 
20 25 30 

Gly Ile Thr Val Val Lys Asn Gln Asn Asp Glu Leu Val Pro Thr Arg 
35 40 45 

Val Gln Asn Gly Trp Gln Val Cys Ile Asp Tyr Ile Lys Leu Asn Val 
50 55 60 

Val Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met Phe 
65 70 75 80 

Glu Arg Leu Ala Gly His Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr Ser 
85 90 95 

Cys Tyr Phe Glx Ile Ala Ile Thr Pro Glu Asp Gln Glu Lys Thr Thr 
100 105 110 

Phe Thr Cys Pro Phe Gly Thr Phe Ser Tyr Arg Cys Met Pro Phe Gly 
115 120 125 

Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Val Ser Ile Phe 
130 135 140 

Ser Asp Tyr Ile Glu Asn Ile Ile Glu Val Phe Met Asp Asp Phe Ile 
145 150 155 160 

Val Tyr Glu Asp Ser Phe Asp Asn Cys Leu His Asn Leu Thr Leu Val 
165 170 175 

Phe Glx Arg Cys Ile Glu Thr Asn Leu Val Leu Asn Phe Glu Lys Cys 
180 185 190 

His Val Met Val Glu Glx Gly Ile Val Leu Gly His Val Val Ser Ser 
195 200 205 

Met Gly Ile Glu Val Asp Lys Val Lys Val Asp Ile Ile Gln Ser Leu 
210 215 220 

Pro Tyr Pro Ile Ser Val Gln Glu Val Arg Ser Phe Leu Gly His Ala 
225 230 235 240 

Gly Phe Tyr Gln Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 124 
<211> 761 
<212> DNA 

<213> Sorghum bicolor 
<400> 124 

gtgcgtaaag aggtcttcaa gctctatcat gctgggatta tttatcctgt gccgcatagt 60 
gagtgggtta gccctgttca agtagtgcca aagaaaggag gaatgacggt cgttaggaat 120 
gagaagaatg aactcatccc tcaacgaatt gtcactgggt ggcgtatgtg tattgactat 180 
caaaaactca acacggctac aaagaaagat aactttccgt tacccttcat tgatgaaatg 240 
ttggaacggc ttgcaaacca ctctttcttc tgtttccttg atggttattc tggatatcac 300 
caaatcccaa tccacccaga tgaccaagaa aagactacct ttacatgccc gtatggaact 360 
tatgcataac gacgaatgtc gttcggactg tgcaatgctc cagcttcttt ccaacggtgc 420 
atgatgtcta ttttctcgga catgattgag aagatcatgg aggttttcat ggatgatttt 480 
accgtctatg gtaaaacctt cgatcattgt ttggagaatt tagatagagt cttgcagcga 540 
tgtgaagaaa agcacttaat cctgaactgg gagaaatgcc attttatggt tcaggaagga 600 
atagtgctag gacataaagt gtccgaacgt ggtatagagg tggacaaagc aaagattgaa 660 
gttattgaaa aacttccacc tcccacgaat gtgaaaggat ccgtagcttc ttgggacatg 720 
cagggttcta tagatgcttc ataaaagact tcacaaaggt t 761 



<210> 125 
<211> 254 
<212> PRT 

<213> Sorghum bicolor 
<400> 125 

Val Arg Lys Glu Val Phe Lys Leu Tyr His Ala Gly Ile Ile Tyr Pro 
1 5 10 15 

Val Pro His Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys 
20 25 30 

Gly Gly Met Thr Val Val Arg Asn Glu Lys Asn Glu Leu Ile Pro Gln 
35 40 45 

Arg Ile Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Gln Lys Leu Asn 
50 55 60 

Thr Ala Thr Lys Lys Asp Asn Phe Pro Leu Pro Phe Ile Asp Glu Met 
65 70 75 80 

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr His Gln Ile Pro Ile His Pro Asp Asp Gln Glu Lys Thr 
100 105 110 

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Glx Arg Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile 
130 135 140 

Phe Ser Asp Met Ile Glu Lys Ile Met Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Thr Val Tyr Gly Lys Thr Phe Asp His Cys Leu Glu Asn Leu Asp Arg 
165 170 175 

Val Leu Gln Arg Cys Glu Glu Lys His Leu Ile Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Glu Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Thr Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Cys Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 126 
<211> 762 
<212> DNA 

<213> Sorghum bicolor 
<400> 126 

gtgcggaagg aggtccttaa attgctgcat gcagggatta tatatcctgt gccgcacagt 60 

gagtgggtga gcccagtaca agttgtgcct aaaaaaggag gcatgactgt tattataaat 120 

gaaaagaacg agctaattcc gcaacgcacc gtcacaggat ggcagatgtg catagactat 180 

agaaaactaa acaaagccac gagaaaggat cactttcctt taccttttat agatgagatg 240 

ctagagcggt tagcaaacca ttcgttcttc tgtttcttag atggatattc agggtatcat 300 

cagatcccga tccatcccga tgatcaaagc aaaaccactt ttacatgccc ttatggaact 360 

tatgcttacc gtagaatgtc ttttgggtta tgtaatgcac cagcttcttt tcaaagatgc 420 

atgatgtcta tattttctga tatgattgaa gagattatgg aagttttcat ggatgatttc 480 

tctgtttatg gaaaagcttt tgatagttgt cttgaaaact tagacaaggt tttgcaaagt 540 

tgtgaagaaa agcacttaat ccttaattgg gaaaaatgtc attttatggt tagggaagga 600 



atagtgctag gacacttagt gtctgaaagg ggtattgagg tagacaaagc tgaaattgaa 660 
gtaattgaac aactacctcc acctgtgaat ataaaaggaa ttcgaagctt tcttggccat 720 
gctggttttt atcgtagatt catcaaagat ttcacgaaag tt 762 



<210> 127 
<211> 254 
<212> PRT 

<213> Sorghum bicolor 
<400> 127 

Val Arg Lys Glu Val Leu Lys Leu Leu His Ala Gly Ile Ile Tyr Pro 
1 5 10 15 

Val Pro His Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys 
20 25 30 

Gly Gly Met Thr Val Ile Ile Asn Glu Lys Asn Glu Leu Ile Pro Gln 
35 40 45 

Arg Thr Val Thr Gly Trp Gln Met Cys Ile Asp Tyr Arg Lys Leu Asn 
50 55 60 

Lys Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met 
65 70 75 80 

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr His Gln Ile Pro Ile His Pro Asp Asp Gln Ser Lys Thr 
100 105 110 

Thr Phe Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr Arg Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile 
130 135 140 

Phe Ser Asp Met Ile Glu Glu Ile Met Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Tyr Gly Lys Ala Phe Asp Ser Cys Leu Glu Asn Leu Asp Lys 
165 170 175 

Val Leu Gln Ser Cys Glu Glu Lys His Leu Ile Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Leu Val Ser 
195 200 205 

Glu Arg Gly Ile Glu Val Asp Lys Ala Glu Ile Glu Val Ile Glu Gln 
210 215 220 

Leu Pro Pro Pro Val Asn Ile Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 128 
<211> 762 
<212> DNA 

<213> Sorghum bicolor 
<400> 128 

gtgcggaagg aagtcttaaa gcttttacac actaggatta tttatctcgt tcctcatagt 60 
gagtgggtta gcacggtaca agttgtgcca aagaaaggag gaatgtcggt tgttaggaat 120 
gagaagaacg aattcatccc tcaacaaact gtcactgggt ggcgtatgtg cattgactac 180 
caaaaactca acaaggccac aaggaaagat cacttcccgt tacctttcat tgatgaaatg 240 
ttgtaatggc ttacaaatca ctcgttcttt tgtttccttg aagggtattc cagatatcat 300 
caaatcccga tccaccacga tgaccaaagt aagactactt tcacatgacc ctatggaact 360 
tacgcatacc gacgaatgtc gttcaggtta tgtaatgctc cagcttcttt tcaacggtgc 420 
atgatgtcta ttttttccaa tatgattgag aaaatcatgg aggtattcac ggatgatttt 480 
accgtatatg gcaaaacctt tgatgattgt ttagagaatt tggacaaagt cttacaattg 540 
tgtgaaggaa agcacttaat cgtaaactag gagaaatgcc attttatggt ccgagaagga 600 
atagtgctag ggcacaaggt gtccgaacgt gggatagagg tggatagagc caagattgaa 660 
gttattgaaa aacttccacc tcccacaaat gtgaaagaca tccgcagttt tcttggacat 720 
gcagggttct ataggcgctt catcaaagat ttcaccaagg tt 762 



<210> 129 
<211> 254 
<212> PRT 

<213> Sorghum bicolor 
<400> 129 

Val Arg Lys Glu Val Leu Lys Leu Leu His Thr Arg Ile Ile Tyr Leu 
1 5 10 15 

Val Pro His Ser Glu Trp Val Ser Thr Val Gln Val Val Pro Lys Lys 
20 25 30 

Gly Gly Met Ser Val Val Arg Asn Glu Lys Asn Glu Phe Ile Pro Gln 
35 40 45 

Gln Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Gln Lys Leu Asn 
50 55 60 

Lys Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met 
65 70 75 80 

Leu Glx Trp Leu Thr Asn His Ser Phe Phe Cys Phe Leu Glu Gly Tyr 
85 90 95 

Ser Arg Tyr His Gln Ile Pro Ile His His Asp Asp Gln Ser Lys Thr 
100 105 110 

Thr Phe Thr Glx Pro Tyr Gly Thr Tyr Ala Tyr Arg Arg Met Ser Phe 
115 120 125 

Arg Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Ile 
130 135 140 

Phe Ser Asn Met Ile Glu Lys Ile Met Glu Val Phe Thr Asp Asp Phe 
145 150 155 160 

Thr Val Tyr Gly Lys Thr Phe Asp Asp Cys Leu Glu Asn Leu Asp Lys 
165 170 175 

Val Leu Gln Leu Cys Glu Gly Lys His Leu Ile Val Asn Glx Glu Lys 
180 185 190 

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Glu Arg Gly Ile Glu Val Asp Arg Ala Lys Ile Glu Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Thr Asn Val Lys Asp Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 130 
<211> 761 
<212> DNA 

<213> Sorghum bicolor 
<400> 130 

gtgcgtaagg aggtttttaa gctgctgcat gcagagatta tatatcatgt gccgcacagt 60 
gagtgggtaa gcccagttca agttgtgcct aaaaagggag gcatgattgt tgttacgaat 120 
gaaaagaacg agctaattcc gcaacgcacc gtcacagggt ggcggatgtg catagactat 180 
agaaaactaa acaaagccac gagaaaggat cattttcctt tacctttcat agatgagatg 240 
ctagagcgat tagcaaacca ttcgttcttc tgtttcttag atggataatt agggtatcac 300 
cagatcccaa tcaatcttga tgatcaaagc aaaaccactt ttccatgccc acatggaact 360 
tatgcttacc gtagaatgtc ttttgggtta tgtaatgcac cagcttcttt tcaaagatgc 420 
atgatgtctg tattttctaa tatgattgaa gagattatgg aattttcatg gatgatttct 480 
ctgtttatgg aaaaactttt gatagttgtc ttgaaaactt agacagggtt ttgcaaagat 540 
gtgaagaaaa gtacttagtc cttaattgga aaaaatgtca ttttatggtt agggaaggaa 600 
tagtgctggg acacctagtg tctgaaagag gtattgaggt cgacaaagct aaaattgaag 660 
taattgaaca actacctcca cctttgaata taaaaggaat tcgaagcttt cttggccatg 720 
ctggttttta tcgtagattc attaaggact ttacaaaggt t 761 



<210> 131 
<211> 254 
<212> PRT 

<213> Sorghum bicolor 



<400> 131 

Val Arg Lys Glu Val Phe Lys Leu Leu His Ala Glu Ile Ile Tyr His 
1 5 10 15 

Val Pro His Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys 
20 25 30 

Gly Gly Met Ile Val Val Thr Asn Glu Lys Asn Glu Leu Ile Pro Gln 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn 
50 55 60 

Lys Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met 
65 70 75 80 

Leu Glu Arg Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Glx 
85 90 95 

Leu Gly Tyr His Gln Ile Pro Ile Asn Leu Asp Asp Gln Ser Lys Thr 
100 105 110 

Thr Phe Pro Cys Pro His Gly Thr Tyr Ala Tyr Arg Arg Met Ser Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Ser Phe Gln Arg Cys Met Met Ser Val 
130 135 140 

Phe Ser Asn Met Ile Glu Glu Ile Met Glu Ile Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Tyr Gly Lys Thr Phe Asp Ser Cys Leu Glu Asn Leu Asp Arg 
165 170 175 

Val Leu Gln Arg Cys Glu Glu Lys Tyr Leu Val Leu Asn Trp Lys Lys 
180 185 190 

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Leu Val Ser 
195 200 205 

Glu Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Glu Val Ile Glu Gln 
210 215 220 

Leu Pro Pro Pro Leu Asn Ile Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 


<210> 132 
<211> 763 
<212> DNA 

<213> Sorghum bicolor 
<400> 132 

gtgcggaaag aggtcgtcaa gctctatcat gctgggatta tttatcctgt gccacatagt 60
gagtgggtta gccctgttca agtagtgcca aagaaagaag gaatgacggt cgttaggaat 120
gagaagaatg aactcatccc tcaacaaatt gtcactagat ggcgtatgtg tattgactat 180
cgaaaactca acaaagctac aaagaaagat cactttccgt tacccttcat tgatgaaatg 240
ttggaatggc ttgcaaacca ctctttcttc tgtttccttg atggttattc tggatatcac 300
caaatcccaa tccacccaga tgaccaagaa aagactacct ttacatgccc gtattgaact 360
tatgcatact gacgaatgtc gttcggattg tgcaatgctc tagcttcttt tccagcggtg 420
catgatgtct attttctcgg acatgattga gaagatcatg gaggttttca tggatgattt 480
taccgtctat ggcaaaacct tcgatcattg tttggagaat ttagatagag tcttgcagcg 540
atgtgaggaa aatcacttaa tcttgaactg ggagaaatgt cattttatgg ttcaggaagg 600
aatagtgcta ggacataaag tgtccgaacg tggtatagat gtggacaaag caaagattaa 660
agttattgaa aaacttccac ctcacacgaa tgtgaaagga atccatagct ttttgggaca 720
tgcagggttc tatagacgct tcatcaagga tttcacaaag gtt 763



<210> 133 
<211> 254 
<212> PRT 

<213> Sorghum bicolor 
<400> 133 

Val Arg Lys Glu Val Val Lys Leu Tyr His Ala Gly Ile Ile Tyr Pro
1 5 10 15

Val Pro His Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Glu Gly Met Thr Val Val Arg Asn Glu Lys Asn Glu Leu Ile Pro Gln
35 40 45

Gln Ile Val Thr Arg Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Lys Ala Thr Lys Lys Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met
65 70 75 80

Leu Glu Trp Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr His Gln Ile Pro Ile His Pro Asp Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Tyr Glx Thr Tyr Ala Tyr Glx Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Leu Ala Ser Phe Gln Arg Cys Met Met Ser Ile
130 135 140

Phe Ser Asp Met Ile Glu Lys Ile Met Glu Val Phe Met Asp Asp Phe
145 150 155 160

Thr Val Tyr Gly Lys Thr Phe Asp His Cys Leu Glu Asn Leu Asp Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Asn His Leu Ile Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Val Ser
195 200 205

Glu Arg Gly Ile Asp Val Asp Lys Ala Lys Ile Lys Val Ile Glu Lys
210 215 220

Leu Pro Pro His Thr Asn Val Lys Gly Ile His Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 134
<211> 756 
<212> DNA 

<213> Sorghum bicolor 
<400> 134 

aaggaggttt tcaagttgct gcatgcaggg attatatatc ttgtgccgca tagtgagtgg 60 
gtaagcccag ttcaagttgt gcctaaaaag ggaggcatga ctattattat gaatgaaaag 120
aacgagctaa ttccgcaacg caccgttaca gtatggcgga tgtgcataga ctatagaaaa 180 
ctaaacaaag ccacgagaga ggatcacttt cctttacctt tcatagatga gatgctagag 240 
tggttagcaa accattcgtt cttctgtttc ttagatggat attgagggta tcatcagatc 300
ccgatccatc ccgatgatca aagcaaaacc acttttacat gcccatatgg aacttatgct 360 
taccgtagaa tgtcttttgg gttatgtaat gcactagctt cttttcaaag atgcatgatg 420 
tctatatttt ctgatatgat tgaagagatt atggaagttt tcatggatga tttctctgtt 480 
tatggaaaaa cttttgatag ttgtcttaaa aacttagaca aggttttgca aagatgtgaa 540 
gaaaagcact tagtccttaa ttgggaaaaa tgtcatttca tggttaggga aggaatagtg 600 
ctgggacact tagtgtctga aagagctatt gaggtagata aagctaaaat tgaagtaatt 660 
gaacaactac gtccacctgt gaacataaaa ggaatttgaa gctttcttgg ccatgctggt 720
tttcatcgta gattcataaa agactttaca aaggtt 756 



<210> 135 
<211> 252 
<212> PRT 

<213> Sorghum bicolor 
<400> 135 

Lys Glu Val Phe Lys Leu Leu His Ala Gly Ile Ile Tyr Leu Val Pro
1 5 10 15

His Ser Glu Trp Val Ser Pro Val Gln Val Val Pro Lys Lys Gly Gly
20 25 30

Met Thr Ile Ile Met Asn Glu Lys Asn Glu Leu Ile Pro Gln Arg Thr
35 40 45

Val Thr Val Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn Lys Ala
50 55 60

Thr Arg Glu Asp His Phe Pro Leu Pro Phe Ile Asp Glu Met Leu Glu
65 70 75 80

Trp Leu Ala Asn His Ser Phe Phe Cys Phe Leu Asp Gly Tyr Glx Gly
85 90 95

Tyr His Gln Ile Pro Ile His Pro Asp Asp Gln Ser Lys Thr Thr Phe
100 105 110

Thr Cys Pro Tyr Gly Thr Tyr Ala Tyr Arg Arg Met Ser Phe Gly Leu
115 120 125

Cys Asn Ala Leu Ala Ser Phe Gln Arg Cys Met Met Ser Ile Phe Ser
130 135 140

Asp Met Ile Glu Glu Ile Met Glu Val Phe Met Asp Asp Phe Ser Val
145 150 155 160

Tyr Gly Lys Thr Phe Asp Ser Cys Leu Lys Asn Leu Asp Lys Val Leu
165 170 175

Gln Arg Cys Glu Glu Lys His Leu Val Leu Asn Trp Glu Lys Cys His
180 185 190

Phe Met Val Arg Glu Gly Ile Val Leu Gly His Leu Val Ser Glu Arg
195 200 205

Ala Ile Glu Val Asp Lys Ala Lys Ile Glu Val Ile Glu Gln Leu Arg
210 215 220

Pro Pro Val Asn Ile Lys Gly Ile Glx Ser Phe Leu Gly His Ala Gly
225 230 235 240

Phe His Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 136 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 136 

gtgcgtaagg aggttgtcaa gcttttggag gttgggctca tatacctcat ctctgacagc 60 
gcttgggtaa gcctagtaca ggtggctccc aagaaatgcg gaatgacagt ggtacaaaat 120
gagaggaatg acttgatacc aacacgaact gtcactggct agcggatgtg tatcgactac 180 
tgcaagttga atgaagccac acggaaggac catttcccct tacctttcat ggatcagatg 240 
ctggagaggc ttgcagggca ggcatactac tgtttcttgg atagatattc aggatacaac 300 
caaatcgcgg tagaccccag agatcaggag aagatggcct ttacatgccc ctttggcgtc 360 
tttgcttaca gaaggatgtc attcaggtta tgtaacgcac cagccacatt tcagaggtgc 420
gtgctggcca ttttttcaga catggtggag aagagcatcg aggtatttat ggatgaattc 480 
tcgatttttg gacccttatt tgacagttgc ttaaggaact tagagatggt actacagagg 540 
tgcgtataga ctaacttggt actaaattag gaaaaatgtc atttcatggt tcgagaggga 600 
atagtgatgg accacaatat ctcagctaga gggattgagg ttgatcaggc aaagatagac 660 
gtcattgaga agttgccacc accactgaat gttaaaggcg tcagaagttt cttagggcat 720
gcaggtttct acaggaggtt tatcaaggac ttcaccaagg tt 762



<210> 137
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 137 

Val Arg Lys Glu Val Val Lys Leu Leu Glu Val Gly Leu Ile Tyr Leu
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Leu Val Gln Val Ala Pro Lys Lys
20 25 30

Cys Gly Met Thr Val Val Gln Asn Glu Arg Asn Asp Leu Ile Pro Thr
35 40 45

Arg Thr Val Thr Gly Glx Arg Met Cys Ile Asp Tyr Cys Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ala Tyr Tyr Cys Phe Leu Asp Arg Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Arg Asp Gln Glu Lys Met
100 105 110

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Ser Phe
115 120 125

Arg Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Val Leu Ala Ile
130 135 140

Phe Ser Asp Met Val Glu Lys Ser Ile Glu Val Phe Met Asp Glu Phe
145 150 155 160

Ser Ile Phe Gly Pro Leu Phe Asp Ser Cys Leu Arg Asn Leu Glu Met
165 170 175

Val Leu Gln Arg Cys Val Glx Thr Asn Leu Val Leu Asn Glx Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Met Asp His Asn Ile Ser
195 200 205

Ala Arg Gly Ile Glu Val Asp Gln Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Leu Asn Val Lys Gly Val Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 138 
<211> 763 
<212> DNA 
<213> Glycine max 

<400> 138 

gtgcgtaagg aggtctttaa gttcttggag gctgggctca tatatcccat ctctaatagc 60
acttaggtaa gcccagtaca ggtggttccc aagaaaggtg gaatgacagt agtacagaat 120
gagaagaatg acttgatacc aacacgaact gtcactagct ggcgaatatg catcgattat 180
cgcaagctga atgaggccac ccggaaggac cacttccctc tacctttcat ggatcagatg 240
ttggagagac ttgcagggca ggcgtattat tgtttcttgg atggatactc gagatataat 300
cagattgcgg tggaccctag agaccaagag aagacgacct tcacatgccc tttttggcgt 360
ctttgcttac agaaggatgc cattcgggtt atgtaatgca ccagccacat ttcagaggtg 420
catgctggcc attttttcag acatggtgga gaaaaatatc gaggtattca tggatgactt 480
ttcagttttt gggccctcat ttgacagttg tttgaggaac ctagagatgg tactttagag 540
gtgcgtagag actaatttag tgctgaactg ggagaagtgt cattttatgg ttcgagaggg 600
catagtcctg agccacaaga tctcagctag agggattgag gttgaccggg caaagataga 660
cgtcatagag aagctgccac caccattgaa tattaaaggt gtcagaagtt tcttagggca 720
tgcaggattc tacaggagat tcataaagga ctttacaaag gtt 763



<210> 139 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 139 

Val Arg Lys Glu Val Phe Lys Phe Leu Glu Ala Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asn Ser Thr Glx Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Gln Asn Glu Lys Asn Asp Leu Ile Pro Thr
35 40 45

Arg Thr Val Thr Ser Trp Arg Ile Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ala Tyr Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Arg Tyr Asn Gln Ile Ala Val Asp Pro Arg Asp Gln Glu Lys Thr
100 105 110

Thr Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Leu Ala Ile
130 135 140

Phe Ser Asp Met Val Glu Lys Asn Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Phe Gly Pro Ser Phe Asp Ser Cys Leu Arg Asn Leu Glu Met
165 170 175

Val Leu Glx Arg Cys Val Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Ser His Lys Ile Ser
195 200 205

Ala Arg Gly Ile Glu Val Asp Arg Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Leu Asn Ile Lys Gly Val Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 140 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 140 

gtgcgcaagg aggttttgaa gcttctagag gttgggctta tctaccccat ctccgacagc 60
gcttgggtaa gcccagtctt ggtggtgtcg aagaaagagg gcatgacagt cattcgaaat 120
gaaaagaatg acctgatacc aacacgaact gtcactagtt ggaaattatg catcgattac 180
cgcaagctca acgaagccac aaggaaagac catttccctc tacccttcat ggatcagatg 240
ttggagagac ttgcaggaca cgcttattat tgcttcttgg atgcatactt tggatataat 300
cagattgttg tagaccccaa ggatcaggag aagatggcct tcacatgccc ttttggtgtc 360
tttgcctata gacggattcc atttgggttg tgcaatgcac ctaccacatt ccaaatgtgc 420
atgttggcca tttttgcaga tatagtggag aaaagcatcg aagtattcat ggatgacttt 480
tcagtatttg tgccctcatt agaaagttgt ttgaagaagt tggagatggt actacaaaga 540
tgcgtggaaa caaacttagt actaaattgg gagaagtgtc acttcatggt tcgagaaggc 600
atagtcttag gccataaaat ttcgacccga ggaattgagg tagaccaaac aaagattgat 660
gtcattgaaa agttgccacc accatcaaat gttaaaggca tcaggagctt cctaggacaa 720
gccaggttct acagaagatt catcaaggac ttcacaaaag tt 762



<210> 141 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 141 

Val Arg Lys Glu Val Leu Lys Leu Leu Glu Val Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Pro Val Leu Val Val Ser Lys Lys
20 25 30

Glu Gly Met Thr Val Ile Arg Asn Glu Lys Asn Asp Leu Ile Pro Thr
35 40 45

Arg Thr Val Thr Ser Trp Lys Leu Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly His Ala Tyr Tyr Cys Phe Leu Asp Ala Tyr
85 90 95

Phe Gly Tyr Asn Gln Ile Val Val Asp Pro Lys Asp Gln Glu Lys Met
100 105 110

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Ile Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Thr Thr Phe Gln Met Cys Met Leu Ala Ile
130 135 140

Phe Ala Asp Ile Val Glu Lys Ser Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Phe Val Pro Ser Leu Glu Ser Cys Leu Lys Lys Leu Glu Met
165 170 175

Val Leu Gln Arg Cys Val Glu Thr Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Thr Arg Gly Ile Glu Val Asp Gln Thr Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Ser Asn Val Lys Gly Ile Arg Ser Phe Leu Gly Gln
225 230 235 240

Ala Arg Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 142 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 142 

gtgcggaagg aggttattaa gttgctagag gcagggctca tttacctaat ctcagatagt 60 
tcataggtta gtcctgttca tgttgctctg aaaaagggag gtatgacagt gataaagaat 120 
gatagagatg agttaattcc tacaagaata gttactggat ggaggatggg tattgattac 180 
aagaagctaa atgaagccac caggaaagac cattacccgc ttcccttcat ggatcaaatg 240
cttgagagac ttgcagggca atcttcctac tatttattag atggatactc gggctacaat 300
caaattgcag tggatcctca ggaccaagaa aagacagctt tcacatgtcc ttttggtgta 360
tttgcttatc gccgcatgtc gttcggttta tgtaatgccc caactacttt ccagagatgt 420
atgatggcaa tttttgctga catggtaaag aaatgtattg aagtttttat ggacgatttc 480
tctgtctttg gtgcatcttt tgaaaattgc ctagcaaatt tagagaaagt gttacaacgc 540
tatgaagaat ctaatttggt gctcaactgg gaaaaatgtc actttatggt tcaagaaggt 600
atcatgctgg gacacaagat ttctagaaga ggaattaagg tggataaggc aaagattgag 660
gttattgata aacttccacc tctagttaat gttagaggca tacgaagttt tttgggtcat 720
gctagattct atcgatgatt tatcaaggac ttcaccaaag tt 762 



<210> 143 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 143 

Val Arg Lys Glu Val Ile Lys Leu Leu Glu Ala Gly Leu Ile Tyr Leu
1 5 10 15

Ile Ser Asp Ser Ser Glx Val Ser Pro Val His Val Ala Leu Lys Lys
20 25 30

Gly Gly Met Thr Val Ile Lys Asn Asp Arg Asp Glu Leu Ile Pro Thr
35 40 45

Arg Ile Val Thr Gly Trp Arg Met Gly Ile Asp Tyr Lys Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ser Ser Tyr Tyr Leu Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Gln Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Ser Phe
115 120 125

Gly Leu Cys Asn Ala Pro Thr Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ala Asp Met Val Lys Lys Cys Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Val Phe Gly Ala Ser Phe Glu Asn Cys Leu Ala Asn Leu Glu Lys
165 170 175

Val Leu Gln Arg Tyr Glu Glu Ser Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Met Leu Gly His Lys Ile Ser
195 200 205

Arg Arg Gly Ile Lys Val Asp Lys Ala Lys Ile Glu Val Ile Asp Lys
210 215 220

Leu Pro Pro Leu Val Asn Val Arg Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Arg Phe Tyr Arg Glx Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 144
<211> 761
<212> DNA
<213> Glycine max

<400> 144

gtgcggaagg aggtctttaa gttgctggaa gcaggcctta tttatcccat ttcggatagt 60
gcatgggtta gccctatgca agttgtccct aagaaaggag gtatgacagt cattaagaat 120
gataaagatg agttgatatc cacaaggacc gtcaccgggt ggagaatgtg cattgactat 180
cgaaagctga atgatgcacc cggaaggacc attatccact ccctttcatg ggccatatgc 240
ttgaaagact tgttgggcaa tcctattatt gttttctaga tggatattat ggttataatc 300
agattgttgt agatcccaaa gatcaagaga agacagcttt cacctaccct tttggtgtat 360
tcgcatatca gtgcatgcct tttggtctat gcaatgcccc agctacattt cagaggtgta 420
tgatggctat tttttctgat atggtggaaa tatgcattga agttttcatg gacgatttct 480
ctatttttgg gccatccttt gaagggtgct tatcaaatct tgaaaaagta ttaaagagat 540
gtgaagagtc caatctagtt ctcaattgga agaaatgcca tttcatggtt caagaaggaa 600
taatgttggg gcataaaatt tcagtaagag ggatagaggt ggacaaggca aagattgatg 660
taattgagaa actacttgct cccatgaatg tcaagggaat aagaagcttc ttaggacatg 720
cagggttcta caggcgattc ataaaagact tcaccaaagt t 761



<210> 145 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 145 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Pro Met Gln Val Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Ile Lys Asn Asp Lys Asp Glu Leu Ile Ser Thr
35 40 45

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Asp Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Met Gly His Met
65 70 75 80

Leu Glu Arg Leu Val Gly Gln Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Tyr Gly Tyr Asn Gln Ile Val Val Asp Pro Lys Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Tyr Pro Phe Gly Val Phe Ala Tyr Gln Cys Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Ser Asp Met Val Glu Ile Cys Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Ile Phe Gly Pro Ser Phe Glu Gly Cys Leu Ser Asn Leu Glu Lys
165 170 175

Val Leu Lys Arg Cys Glu Glu Ser Asn Leu Val Leu Asn Trp Lys Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Met Leu Gly His Lys Ile Ser
195 200 205

Val Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Leu Ala Pro Met Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 146 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 146 

gtgcgtaagg aggtggtcaa gttgcttgaa gtaggactaa tttatccaat ctctgatagt 60 
gcttgggtga gttcgaacta ggtggtgcct aagaaaggtg gtatgacggt gatccacaat 120 
gataagaatg atcttattcc tacacagaca atcattaggt ggcaaatgtg tattgactat 180 
cacaagttga atgatgtcac caagaaggac cattttcctc tgccattcat ggaccaaatg 240
ttagagaggt tagctggcca agctttttat tgttttttgg atggttattc tgggtataac 300 
caaatagcgg tgcatcttaa agatcaagag aagactacta tcatatgccc atttggtgtc 360 
tttgcttaca gacaaatgtc atttgaactg tgtaatgccc ctaccacctt ctagagattc 420 
atgatggcca tttttgctga ccttgtggag aaatgcatag aggtgttcat gaatgatttc 480 
tctattttcg gctcttcctt ttatcattgt ttatccaacc tggaattagt gttacaacgg 540 
tgtgcggaaa ccaatttgtt gatgaactgg gagaaatgtc atttcatggt ccaagagggg 600 
attgtcttag gccacaagat ctcttccaga gggttggaag tggacaaggc aaaaattgat 660 
gttattgaga agttgcctcc acctatgaat gtgaaaggca tccgaagttt tctcgaatat 720
gttggatttt ataggaggtt catcaaagac ttcacgaaag tt 762



<210> 147
<211> 254
<212> PRT
<213> Glycine max

<400> 147

Val Arg Lys Glu Val Val Lys Leu Leu Glu Val Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Ser Asn Glx Val Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Ile His Asn Asp Lys Asn Asp Leu Ile Pro Thr
35 40 45

Gln Thr Ile Ile Arg Trp Gln Met Cys Ile Asp Tyr His Lys Leu Asn
50 55 60

Asp Val Thr Lys Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ala Phe Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Ala Val His Leu Lys Asp Gln Glu Lys Thr
100 105 110

Thr Ile Ile Cys Pro Phe Gly Val Phe Ala Tyr Arg Gln Met Ser Phe
115 120 125

Glu Leu Cys Asn Ala Pro Thr Thr Phe Glx Arg Phe Met Met Ala Ile
130 135 140

Phe Ala Asp Leu Val Glu Lys Cys Ile Glu Val Phe Met Asn Asp Phe
145 150 155 160

Ser Ile Phe Gly Ser Ser Phe Tyr His Cys Leu Ser Asn Leu Glu Leu
165 170 175

Val Leu Gln Arg Cys Ala Glu Thr Asn Leu Leu Met Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Ser Arg Gly Leu Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Met Asn Val Lys Gly Ile Arg Ser Phe Leu Glu Tyr
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 148 
<211> 762 
<212> DNA 
<213> Glycine max 



<400> 148 

gtgcgtaagg aggttctcaa gcttttggag gttgggctca tatacctcat ctctgacagc 60
gcttgggtaa gcctagtaca ggtggctccc aagaaatgcg gaatgacagt ggtacaaaat 120
gagaggaatg acttgatacc aacacgaact gtcactggct agcggatgtg tatcgactac 180
tgcaagttga atgaagccac acggaaggac catttcccct tacctttcat ggatcagatg 240
ctggagaggc ttgcagggca ggcatactac tgtttcttgg atagatattc aggatacaac 300
caaatcgcgg tagaccccag agatcaggag aagatggcct ttacatgccc ctttggcgtc 360
tttgcttaca gaaggatgtc attcaggtta tgtaacgcac cagccacatt tcagaggtgc 420
atgctggcca ttttttcaga catggtggag aagagcatcg aggtatttat ggatgaattc 480
tcgatttttg gacccttatt tgacagttgc ttaaggaact tagagatggt actacagagg 540
tgcgtataga ctaacttggt actaaattag gaaaaatgtc atttcatggt tcgagaggga 600
atagtgatgg gccacaatat ctcagctaga gggattgagg ttgatcagac aaagatagac 660
gtcattgaga agttgccacc accactgaat gttaaaggcg tcagaagttt cttagggcat 720
gcaggtttct acaggaggtt cataaaagac ttcacaaagg tt 762



<210> 149 

<211> 254 

<212> PRT 

<213> Glycine max 



<400> 149 
Val Arg Lys Glu Val Leu Lys Leu Leu Glu Val Gly Leu Ile Tyr Leu
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Leu Val Gln Val Ala Pro Lys Lys
20 25 30

Cys Gly Met Thr Val Val Gln Asn Glu Arg Asn Asp Leu Ile Pro Thr
35 40 45

Arg Thr Val Thr Gly Glx Arg Met Cys Ile Asp Tyr Cys Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ala Tyr Tyr Cys Phe Leu Asp Arg Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Arg Asp Gln Glu Lys Met
100 105 110

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Ser Phe
115 120 125

Arg Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Leu Ala Ile
130 135 140

Phe Ser Asp Met Val Glu Lys Ser Ile Glu Val Phe Met Asp Glu Phe
145 150 155 160

Ser Ile Phe Gly Pro Leu Phe Asp Ser Cys Leu Arg Asn Leu Glu Met
165 170 175

Val Leu Gln Arg Cys Val Glx Thr Asn Leu Val Leu Asn Glx Glu Lys
180 185 190

Cys His Phe Met Val Arg Glu Gly Ile Val Met Gly His Asn Ile Ser
195 200 205

Ala Arg Gly Ile Glu Val Asp Gln Thr Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Leu Asn Val Lys Gly Val Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 150 
<211> 761 
<212> DNA 
<213> Glycine max 

<400> 150 

gtgcgtaagg aggtttttaa gttgctggaa gcaggtctta tttatcccat ttcggatagt 60
gcatgggtta gccctgtgca ggttgtcccc aagaaagaag gtaagacagt cattaaggat 120
gaaaaggatg agttgatatc cacaaggact atcaccgggt ggagaatgtg cattgactat 180
cagaagctga atgatgccac ccggaaggac cattatccac tccctttcat ggaccaaatg 240
cttgaaagac ttgccgggca atcttattat tgttttctgg atggatattc tggttataat 300
cagattgatg tagatcccaa ggatcaagag aagactgctt tcacctaccc ttttggtgta 360
ttcgcctatc ggcgcatgcc ctttggtttg tgcaatgccc cagctacatt tcagaggtgt 420
atgatgacta ttttttctga tatggtggaa aaatgaattg aagttttcat ggacgatttc 480
tctatttttg ggccatcttt tgaagggtgc ttatcaaatc ttgaaagagt attaaagaga 540
cgtgaagagt ccaaactagt tctcaattgg gagaaatgcc atttcatggt tcaagaagga 600
atagtgtggg gcataaaatt tcagtaagag ggatagaggt ggacaaggca aagattgatg 660
taatagagaa actacctcct cccatgaatg tcaagggaat aagaagcttc ctaggacatg 720
cagggttcta caagcgattc atcaaagatt tcacaaaggt t 761



<210> 151 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 151 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Glu Gly Lys Thr Val Ile Lys Asp Glu Lys Asp Glu Leu Ile Ser Thr
35 40 45

Arg Thr Ile Thr Gly Trp Arg Met Cys Ile Asp Tyr Gln Lys Leu Asn
50 55 60

Asp Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Asp Val Asp Pro Lys Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Tyr Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Thr Ile
130 135 140

Phe Ser Asp Met Val Glu Lys Glx Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Ile Phe Gly Pro Ser Phe Glu Gly Cys Leu Ser Asn Leu Glu Arg
165 170 175

Val Leu Lys Arg Arg Glu Glu Ser Lys Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Val Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Met Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Lys Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 152 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 152 

gtgcggaaag aggtattcaa gttactagag gcagggctca tctacccaat ttcagatagc 60 
tcctgggtta gtccggttca agttgttcca aaaaaaggag ggatgacagt ggtaaaaaat 120
gatagaaatg agctaattcc tacaagaaga gtcaccagat ggagaatgtg tattgattat 180
aggaagctca atgaagccac aagaaaagac cattacccac ttcccttcat ggatcaaatg 240
cttaagagac ttgcaaggca atccttctac cgtttcttgg acggatactc aggttacaat 300
cagattgcag tggatcctca ggatcaagaa aaaacagctt ttacatgtcc tttcagtgtt 360
tttgcttatc gccgcatgcc gttcggttta tgtaatgcct ctactacttt tcagagatgt 420
atgatggcaa tttttgatga catggtagag aaatgtattg aagtctttat ggatgatttt 480
tcgttctttg gtgcatcttt tggaaattgc ttagcaaatt tagagaaagt gttacaacgt 540
tgtgaaaaat ctaatttggt gcttaactgg gaaaaatgtc actttatggt acaagaaggt 600
attgtgctag gacacaaaat ctctaaaaga ggaattgagg tggttaaaga aaaactagat 660
gttattgata aacttccacc cccagttaat gtaaaaggca tacacagttt tttgggtcat 720
gttggatttt atcggcgatt cataaaggac ttcaccaaag tt 762 



<210> 153 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 153 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ser Trp Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Gly Gly Met Thr Val Val Lys Asn Asp Arg Asn Glu Leu Ile Pro Thr
35 40 45

Arg Arg Val Thr Arg Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Glu Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Lys Arg Leu Ala Arg Gln Ser Phe Tyr Arg Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Gln Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Cys Pro Phe Ser Val Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Ser Thr Thr Phe Gln Arg Cys Met Met Ala Ile
130 135 140

Phe Asp Asp Met Val Glu Lys Cys Ile Glu Val Phe Met Asp Asp Phe
145 150 155 160

Ser Phe Phe Gly Ala Ser Phe Gly Asn Cys Leu Ala Asn Leu Glu Lys
165 170 175

Val Leu Gln Arg Cys Glu Lys Ser Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Lys Arg Gly Ile Glu Val Val Lys Glu Lys Leu Asp Val Ile Asp Lys
210 215 220

Leu Pro Pro Pro Val Asn Val Lys Gly Ile His Ser Phe Leu Gly His
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 154 
<211> 761 
<212> DNA 
<213> Glycine max 

<400> 154 

gtgcgtaaag aagttttgaa gctgctagaa gcagacctta tttatcccat ttcggatagt 60 
acatgggtta gccctgtgca agttgtcccc gagaaaggag gtatgacagt cattaagaat 120
gataaagatg agttgatatc cacaaggact gtcaccgggt gagaatgtgc attgactatc 180
ggaagctgaa tgatgccacc cagaaggacc attattcact ccctttcatg gaccagatgc 240
ttgaaagact tgccggacaa tcctattatt gttttctgaa tggatactct ggctataatc 300
agattgtggt agatcccaaa gatcaggaga aaactgcttt cacctgcctt tttggtgtat 360
ttgcatacaa gcgtatgcat tttggcttgt gtaatgctcc aactacgtgt cagaggtgta 420
tgatgactat tttttctggt atcgtggaaa aatgcattga acttttcatg gacgatttct 480
ctatttttgg gccatctttt gaaggctact tatcaaacct tgaaagagta ttacagagat 540
gtgaagagtc taatctagtt ctcaattggg agaaatgcca tttcatggtt caagaaggaa 600
tagtgctggg gcataaaatt tcagtaagag ggatagaggt ggacaaggca aagattgatg 660
taattgagaa actacctcct cccatgattg tcaagggaat aagaagcctc ctaggacatg 720
tagggttcta caggcgattc atcaaagact tcacaaaggt t 761 



<210> 155 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 155 

Val Arg Lys Glu Val Leu Lys Leu Leu Glu Ala Asp Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Thr Trp Val Ser Pro Val Gln Val Val Pro Glu Lys
20 25 30

Gly Gly Met Thr Val Ile Lys Asn Asp Lys Asp Glu Leu Ile Ser Thr
35 40 45

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn
50 55 60

Asp Ala Thr Gln Lys Asp His Tyr Ser Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ser Tyr Tyr Cys Phe Leu Asn Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Val Val Asp Pro Lys Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Cys Leu Phe Gly Val Phe Ala Tyr Lys Arg Met His Phe
115 120 125

Gly Leu Cys Asn Ala Pro Thr Thr Cys Gln Arg Cys Met Met Thr Ile
130 135 140

Phe Ser Gly Ile Val Glu Lys Cys Ile Glu Leu Phe Met Asp Asp Phe
145 150 155 160

Ser Ile Phe Gly Pro Ser Phe Glu Gly Tyr Leu Ser Asn Leu Glu Arg
165 170 175

Val Leu Gln Arg Cys Glu Glu Ser Asn Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Val Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Met Ile Val Lys Gly Ile Arg Ser Leu Leu Gly His
225 230 235 240

Val Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val
245 250



<210> 156 
<211> 762 
<212> DNA 
<213> Glycine max 

<400> 156 

gtgcgtaagg aggtttttaa gttgctggaa gcaggtctta tttatcccat ttcggatagt 60 
gcatgggtta gccctgtgca ggttgtcccc aagaaagaag gtaagacagt cattaaggat 120
gaaaaagatg agttgatatc cacaaggact atcaccgggt ggagaatgtg cattgactat 180
cagaagctga atgatgccac ccggaaggac cattatccac tccctttcat ggaccaaatg 240
cttgaaagac ttgccgggca atcttattat tgttttctgg atggatattc tggttataat 300
cagattgatg tagatcccaa ggatcaagag aagactgctt tcacctaccc ttttggtgta 360
ttcgcctatc ggcgcatgcc ctttggtttg tgcaatgccc cagctacatt tcagaggtgt 420
atgatgacta ttttttctga tatggtggaa aaatgaattg aagttttcat ggacgatgtc 480
tctatttttg ggccatcttt tgaagggtgc ttatcaaatc ttgaaagagt attaaagaga 540
cgtgaagagt ccaaactagt tctcaattgg gagaaatgcc atttcatggt tcaagaagga 600
atagtgttgg ggcataaaat ttcagtaaga gggatagagg tggacaaggc aaagattgat 660
gtaatagaga aactacctcc tcccatgaat gtcaagggaa taagaagctt cctaggacat 720
gcagggttct acaagcgatt catcaaagac ttctcaaaag tt 762 



<210> 157 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 157 

Val Arg Lys Glu Val Phe Lys Leu Leu Glu Ala Gly Leu Ile Tyr Pro
1 5 10 15

Ile Ser Asp Ser Ala Trp Val Ser Pro Val Gln Val Val Pro Lys Lys
20 25 30

Glu Gly Lys Thr Val Ile Lys Asp Glu Lys Asp Glu Leu Ile Ser Thr
35 40 45

Arg Thr Ile Thr Gly Trp Arg Met Cys Ile Asp Tyr Gln Lys Leu Asn
50 55 60

Asp Ala Thr Arg Lys Asp His Tyr Pro Leu Pro Phe Met Asp Gln Met
65 70 75 80

Leu Glu Arg Leu Ala Gly Gln Ser Tyr Tyr Cys Phe Leu Asp Gly Tyr
85 90 95

Ser Gly Tyr Asn Gln Ile Asp Val Asp Pro Lys Asp Gln Glu Lys Thr
100 105 110

Ala Phe Thr Tyr Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Pro Phe
115 120 125

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Met Thr Ile
130 135 140

Phe Ser Asp Met Val Glu Lys Glx Ile Glu Val Phe Met Asp Asp Val
145 150 155 160

Ser Ile Phe Gly Pro Ser Phe Glu Gly Cys Leu Ser Asn Leu Glu Arg
165 170 175

Val Leu Lys Arg Arg Glu Glu Ser Lys Leu Val Leu Asn Trp Glu Lys
180 185 190

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser
195 200 205

Val Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys
210 215 220

Leu Pro Pro Pro Met Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His
225 230 235 240

Ala Gly Phe Tyr Lys Arg Phe Ile Lys Asp Phe Ser Lys Val
245 250



<210> 158
<211> 761 
<212> DNA 
<213> Glycine max 

<400> 158 

gtgcggaagg aggttcttaa gctcctggaa gcagggctca tctatcttat ctcagatagt 60
gttgggtgag tccagtgcat gtggttccca agaagggtgg gaagactgtg gtgagaaatg 120
agaaaaatga cctcattcta acccgaactg tcacaggatg gagaatgtgc atagattatc 180
ggaagttgaa tgatgccatc aagaaggatc acttccctct accattcata gatcagatgc 240
ttgagaggtt agcaagccag tctttctatt atttcttgga tgaatattct agatacaatc 300
agattgctat acatcccaag gaccaagaga agattgcatt tacatgccca tttggtgtct 360
ttgcctatag aaggatgcca tttgaactat gcaatgctcc agctaccttt tagaggcata 420
tgctagccat attcgctaac atggtggaga aatgcatcga agtgttcata gatgattttt 480
cggtgtttgg tccatccttt gtttgttgtt tgaccaattt agagctagtg ttgaagtact 540
gtgaggagac aaatttagta ttgaattggg agaaatgtca tttcatggtc caagaaggaa 600
ttatgttggg gcataaaatt tttgctagag gtattgaggt ggacaaggcc aaaattgatg 660
ttattgaaaa gctgcctcca ccagtcaatg taaaaggcat caggagtttt cttggacaca 720
ctggtttctt caggcgtttc atcaaggact tcacaaaagt t 761



<210> 159 
<211> 254 
<212> PRT 
<213> Glycine max 

<400> 159 

Val Arg Lys Glu Val Leu Lys Leu Leu Glu Ala Gly Leu Ile Tyr Leu 
1 5 10 15 

Ile Ser Asp Ser Ala Trp Val Ser Pro Val His Val Val Pro Lys Lys 
20 25 30 

Gly Gly Lys Thr Val Val Arg Asn Glu Lys Asn Asp Leu Ile Leu Thr 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Lys Leu Asn 
50 55 60 

Asp Ala Ile Lys Lys Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Glu Arg Leu Ala Ser Gln Ser Phe Tyr Tyr Phe Leu Asp Glu Tyr 
85 90 95 

Ser Arg Tyr Asn Gln Ile Ala Ile His Pro Lys Asp Gln Glu Lys Ile 
100 105 110 

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Arg Met Pro Phe 
115 120 125 

Glu Leu Cys Asn Ala Pro Ala Thr Phe Glx Arg His Met Leu Ala Ile 
130 135 140 

Phe Ala Asn Met Val Glu Lys Cys Ile Glu Val Phe Ile Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Pro Ser Phe Val Cys Cys Leu Thr Asn Leu Glu Leu 
165 170 175 

Val Leu Lys Tyr Cys Glu Glu Thr Asn Leu Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Gln Glu Gly Ile Met Leu Gly His Lys Ile Phe 
195 200 205 

Ala Arg Gly Ile Glu Val Asp Lys Ala Lys Ile Asp Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Val Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Thr Gly Phe Phe Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 160 
<211> 762 
<212> DNA 
<213> Pisum sativum 

<400> 160 

gtgcgcaagg aagtactcaa gttgttagat tcgggaatga tttaccccat ttctgacagc 60 
tcgtgggtaa gtccagtgca cgtggtacca aagaaaggag gaacctcagt aattttaaat 120 
gaaaagaatg aactgatccc aactcgcaca gtgacagggt ggcgagtatg catcgatcac 180 
agaagactga acacagcaac aagaaaggat cattttcctc tcccttttat tgatcaaatg 240 
ttagaaagac ttgcaggtca tgagtattat tgctttctgg atggatattc gggatacaat 300 
caaattgttg tagccccgga agatcaggaa aaaactgcat ttacatgtcc ttatggtatt 360 
ttcgcttaca gacggatgcc atttgggcta tgcaatgccc cagctacttt tcagaggtgt 420 
atgacatcta tattctccga catgcttgaa aagtatatga aggtgtttat ggatgatttc 480 
tctgtgtttg gttcttcttt tgataattgt ttagctaact tgtctcttgt tttgcaaaga 540 
tgtcaggaaa ctaaccttgt tctcaattgg gagaaatgtc atttcatggt gcaggaagga 600 
attgtgctag gacacaaaat ttcccacaaa ggaattgaag tggacaaagc caaagtggag 660 
gttatagcta acctcccacc tccggtgaat gaaaaaggga taaggagttt tttgggtcat 720 
gcaggttttt atcgcaggtt catcaaagac ttcacaaagg tt 762 



<210> 161 
<211> 254 
<212> PRT 
<213> Pisum sativum 

<400> 161 

Val Arg Lys Glu Val Leu Lys Leu Leu Asp Ser Gly Met Ile Tyr Pro 
1 5 10 15 

Ile Ser Asp Ser Ser Trp Val Ser Pro Val His Val Val Pro Lys Lys 
20 25 30 

Gly Gly Thr Ser Val Ile Leu Asn Glu Lys Asn Glu Leu Ile Pro Thr 
35 40 45 

Arg Thr Val Thr Gly Trp Arg Val Cys Ile Asp His Arg Arg Leu Asn 
50 55 60 

Thr Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Ile Asp Gln Met 
65 70 75 80 

Leu Glu Arg Leu Ala Gly His Glu Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Val Val Ala Pro Glu Asp Gln Glu Lys Thr 
100 105 110 

Ala Phe Thr Cys Pro Tyr Gly Ile Phe Ala Tyr Arg Arg Met Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Met Thr Ser Ile 
130 135 140 

Phe Ser Asp Met Leu Glu Lys Tyr Met Lys Val Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Ser Ser Phe Asp Asn Cys Leu Ala Asn Leu Ser Leu 
165 170 175 

Val Leu Gln Arg Cys Gln Glu Thr Asn Leu Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Gln Glu Gly Ile Val Leu Gly His Lys Ile Ser 
195 200 205 

His Lys Gly Ile Glu Val Asp Lys Ala Lys Val Glu Val Ile Ala Asn 
210 215 220 

Leu Pro Pro Pro Val Asn Glu Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 



<210> 162 
<211> 762 
<212> DNA 
<213> Pisum sativum 

<400> 162 

gtgcgtaagg aggtctttaa actattggat gcgggaatga tttacccgat ctcggatagt 60 
ccgtgggtta gtcccgtgca cgtggttccg aagaagggtg gaatgaccgt aatccgtaat 120 
gacaaagacg aattgatccc gactaaagtt gcaacggggt ggagaatatg tatagattat 180 
agacagttga ataccgcgac tcgaaaggac cattttccac tcccatttat ggatcaaatg 240 
cttgaaagac tatcgggcca acaatactat tgtttcttgg acggctactc cgggtacaac 300 
caaattgcgg ttgacccggt tgatcatgag aagacggctt tcacgtgtcc gtttggagtg 360 
ttcgcataca gaaaaatgcc ctttgggctg tgcaatgcac cggcgacttt ccaacgatgc 420 
gtcctagcca tttttgccga tctaatagag aaaacaatgg acgtcttcat ggatgacttc 480 
tcggtatttg gtgggacgtt tagtctatgc ttggcaaatt tgaagacggt gttggaaagg 540 
tgtgtgaaga ccaatttggt gctaaattgg gaaaagtgtc acttcatggt gaccgagggg 600 
atcgtgctag gccacaaagt ctctaaaagg gggcttgaag tggatagagc taaggttgaa 660 
gtaattgaaa aattaccccc tccggtgaat gtgaaaggca tccgtagctt tttggggcac 720 
gcggggtttt accggcgctt cattaaagac ttctcaaaag tt 762 



<210> 163 
<211> 254 
<212> PRT 
<213> Pisum sativum 

<400> 163 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Ala Gly Met Ile Tyr Pro 
1 5 10 15 

Ile Ser Asp Ser Pro Trp Val Ser Pro Val His Val Val Pro Lys Lys 
20 25 30 

Gly Gly Met Thr Val Ile Arg Asn Asp Lys Asp Glu Leu Ile Pro Thr 
35 40 45 

Lys Val Ala Thr Gly Trp Arg Ile Cys Ile Asp Tyr Arg Gln Leu Asn 
50 55 60 

Thr Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met 
65 70 75 80 

Leu Glu Arg Leu Ser Gly Gln Gln Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Val Asp His Glu Lys Thr 
100 105 110 

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Lys Met Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Val Leu Ala Ile 
130 135 140 

Phe Ala Asp Leu Ile Glu Lys Thr Met Asp Val Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Gly Thr Phe Ser Leu Cys Leu Ala Asn Leu Lys Thr 
165 170 175 

Val Leu Glu Arg Cys Val Lys Thr Asn Leu Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Thr Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Lys Arg Gly Leu Glu Val Asp Arg Ala Lys Val Glu Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Val Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Ser Lys Val 
245 250 



<210> 164 
<211> 762 
<212> DNA 
<213> Pisum sativum 

<400> 164 

gtgcggaagg aggtctttaa attgttggat gcggggatga tttacccgat ctcggatagt 60 
ccatgggtta gtcctgtgca cgttgttccg aagaaggggg ggattaccgt aatccggaat 120 
gacaaggatg aattgatccc cactaaagtt gaaacggggt ggagaatgtg tattgattat 180 
aggcggttga ataccgcgac tcgaaaagac cattttccac tcccatttat ggatcaaatg 240 
ctcgaaagac tatcgggcca acaatattat tgttttttgg acggctactc cgggtacaac 300 
caaattgcgg ttgacccggc cgatcatgag aagacggctt tcacatgtcc gtttggagtg 360 
ttcgcatacc gaaaaatgcc ctttgggctg tgcaatgcac cggcgacctt ccaacgatgt 420 
gtccaagcca tttttgtcga tctgatagag aaaacaatgg aagtcttcat ggatgacttc 480 
tcggtatttg gtgggtcttt tagtctatgc ttggcgaact tgaaaacggt gttggagaga 540 
tgtgtgaaga ccaatttggt gcttaattgg gagaagtgtc acttcatggt gaccgagggg 600 
atcgtgctag gccacaaagt ctctagaagg gggcttgaag tggatagagc taaggttgaa 660 
gtgatagaaa aattacctcc tccggtgaat gtgaagggca tccgaagctt tttggggcac 720 
gccgggttct accggcgctt cattaaagat ttcacaaagg tt 762 



<210> 165 
<211> 254 
<212> PRT 
<213> Pisum sativum 

<400> 165 

Val Arg Lys Glu Val Phe Lys Leu Leu Asp Ala Gly Met Ile Tyr Pro 
1 5 10 15 

Ile Ser Asp Ser Pro Trp Val Ser Pro Val His Val Val Pro Lys Lys 
20 25 30 

Gly Gly Ile Thr Val Ile Arg Asn Asp Lys Asp Glu Leu Ile Pro Thr 
35 40 45 

Lys Val Glu Thr Gly Trp Arg Met Cys Ile Asp Tyr Arg Arg Leu Asn 
50 55 60 

Thr Ala Thr Arg Lys Asp His Phe Pro Leu Pro Phe Met Asp Gln Met 
65 70 75 80 

Leu Glu Arg Leu Ser Gly Gln Gln Tyr Tyr Cys Phe Leu Asp Gly Tyr 
85 90 95 

Ser Gly Tyr Asn Gln Ile Ala Val Asp Pro Ala Asp His Glu Lys Thr 
100 105 110 

Ala Phe Thr Cys Pro Phe Gly Val Phe Ala Tyr Arg Lys Met Pro Phe 
115 120 125 

Gly Leu Cys Asn Ala Pro Ala Thr Phe Gln Arg Cys Val Gln Ala Ile 
130 135 140 

Phe Val Asp Leu Ile Glu Lys Thr Met Glu Val Phe Met Asp Asp Phe 
145 150 155 160 

Ser Val Phe Gly Gly Ser Phe Ser Leu Cys Leu Ala Asn Leu Lys Thr 
165 170 175 

Val Leu Glu Arg Cys Val Lys Thr Asn Leu Val Leu Asn Trp Glu Lys 
180 185 190 

Cys His Phe Met Val Thr Glu Gly Ile Val Leu Gly His Lys Val Ser 
195 200 205 

Arg Arg Gly Leu Glu Val Asp Arg Ala Lys Val Glu Val Ile Glu Lys 
210 215 220 

Leu Pro Pro Pro Val Asn Val Lys Gly Ile Arg Ser Phe Leu Gly His 
225 230 235 240 

Ala Gly Phe Tyr Arg Arg Phe Ile Lys Asp Phe Thr Lys Val 
245 250 